Axon
What is Axon?
Axon is a production-ready memory management system for Large Language Model (LLM) applications. It provides intelligent multi-tier storage, policy-driven lifecycle management, and semantic recall with automatic compaction and summarization.
Key Features
Multi-Tier Memory Architecture
Automatically route memories across ephemeral, session, and persistent tiers based on importance and access patterns.
Policy-Driven Lifecycle
Define custom policies for TTL, capacity limits, promotion/demotion thresholds, and automatic summarization.
Semantic Search & Recall
Vector-based similarity search with metadata filtering across all tiers simultaneously.
Production-Grade Features
Built-in audit logging, PII detection, structured logging, and two-phase commit transactions.
Ecosystem Integration
First-class support for LangChain and LlamaIndex with native adapters.
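To make the Semantic Search & Recall feature concrete, here is a minimal, library-free sketch of similarity search with a metadata filter applied across several tiers at once. It is not Axon's implementation: `recall_across_tiers`, the entry keys (`text`, `vector`, `tags`), and the toy two-dimensional vectors are all hypothetical, and a real deployment would use an embedder and a vector store instead of in-memory lists.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recall_across_tiers(tiers, query_vector, k=5, required_tags=()):
    """Return the k most similar entries from all tiers that carry every required tag.

    `tiers` maps a tier name to a list of entries; each entry is a dict with the
    hypothetical keys "text", "vector", and "tags".
    """
    required = set(required_tags)
    scored = []
    for tier_name, entries in tiers.items():
        for entry in entries:
            if not required.issubset(entry["tags"]):
                continue  # metadata filter: skip entries missing a required tag
            score = cosine_similarity(query_vector, entry["vector"])
            scored.append((score, tier_name, entry["text"]))
    # Highest similarity first, then keep the top k
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Toy data: hand-written 2-D vectors stand in for real embeddings
tiers = {
    "session": [
        {"text": "User prefers dark mode", "vector": [1.0, 0.0], "tags": ["preference", "ui"]},
    ],
    "persistent": [
        {"text": "User is based in Berlin", "vector": [0.0, 1.0], "tags": ["profile"]},
        {"text": "User likes compact layouts", "vector": [0.8, 0.6], "tags": ["preference", "ui"]},
    ],
}
results = recall_across_tiers(tiers, [1.0, 0.0], k=2, required_tags=["ui"])
```

Each result is a `(score, tier_name, text)` tuple, so the caller can see which tier a hit came from without querying the tiers one by one.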
Quick Example
import asyncio

from axon import MemorySystem
from axon.core.templates import balanced


async def main():
    # Create memory system with balanced configuration
    system = MemorySystem(config=balanced())

    # Store memories with automatic tier routing
    await system.store(
        "User prefers dark mode",
        importance=0.8,
        tags=["preference", "ui"],
    )

    # Recall memories semantically
    results = await system.recall(
        "What are the user's UI preferences?",
        k=5,
    )

    # Automatic compaction and summarization
    await system.compact(tier="session", strategy="hybrid")


# store, recall, and compact are awaited, so they must run inside an event loop
asyncio.run(main())
Architecture Overview
graph TB
A[Your LLM Application] --> B[MemorySystem API]
B --> C{Router}
C -->|importance < 0.3| D[Ephemeral Tier]
C -->|importance 0.3-0.7| E[Session Tier]
C -->|importance > 0.7| F[Persistent Tier]
D --> G[In-Memory / Redis]
E --> H[Redis / ChromaDB]
F --> I[Qdrant / Pinecone / ChromaDB]
J[PolicyEngine] -.->|promotion| C
J -.->|demotion| C
K[ScoringEngine] -.->|importance| J
style B fill:#4051B5,color:#fff
style C fill:#5C6BC0,color:#fff
style J fill:#00BCD4,color:#fff
style K fill:#00BCD4,color:#fff
Why Axon?
For LLM Applications
| Problem | Axon Solution |
|---|---|
| Token Limits | Automatic summarization and compaction keep context windows manageable |
| Cost | Intelligent tier routing reduces expensive vector DB operations |
| Session Management | Built-in session isolation with TTL and lifecycle policies |
| PII & Privacy | Automatic PII detection with configurable privacy levels |
| Observability | Structured logging and audit trails for compliance |
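The Token Limits row can be illustrated with a small sketch of compaction: older memories are collapsed into one summary entry while the most recent ones are kept verbatim. This is an illustration of the general idea only, not Axon's `compact` implementation; the function name, the `keep_recent` parameter, and the placeholder summarizer (plain string joining, where a real system would more likely call an LLM) are assumptions.

```python
def compact(memories, keep_recent=3, summarize=None):
    """Collapse all but the newest `keep_recent` memories into one summary entry.

    `memories` is a list of strings ordered oldest to newest.
    """
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")
    if summarize is None:
        # Placeholder summarizer (assumption): joins texts instead of calling an LLM
        def summarize(texts):
            return "Summary: " + "; ".join(texts)
    if len(memories) <= keep_recent:
        return list(memories)  # nothing old enough to compact
    split = len(memories) - keep_recent
    older, recent = memories[:split], memories[split:]
    return [summarize(older)] + list(recent)


history = ["a", "b", "c", "d", "e"]
compacted = compact(history, keep_recent=2)
```

The context handed to the model shrinks from five entries to three, and the summarizer can be swapped for any callable that maps a list of strings to one string.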
For Developers
- Simple API: Store, recall, forget - that's it
- Framework Agnostic: Use with any LLM framework or standalone
- Type Safe: Full type hints and Pydantic validation
- Async-First: Built on asyncio for high performance
- Extensible: Custom adapters, policies, and embedders
Installation
Core Concepts
Memory Tiers
Axon organizes memories into three tiers:
- Ephemeral: Short-lived, high-volume data (TTL-based)
- Session: Session-scoped context with summarization
- Persistent: Long-term semantic storage
Policies
Policies define lifecycle rules:
from axon.core.policies import SessionPolicy

policy = SessionPolicy(
    ttl_minutes=60,         # Session expires after 1 hour
    max_items=100,          # Limit to 100 memories
    summarize_after=50,     # Summarize when reaching 50 items
    promote_threshold=0.8,  # Promote high-importance memories
)
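To show how rules like these might be evaluated, the following standalone sketch mirrors the four `SessionPolicy` fields in a plain dataclass and derives the lifecycle actions that apply to a memory. It does not use or describe Axon's policy engine: `SessionRules`, `lifecycle_actions`, the action names, and the exact comparison operators are assumptions inferred from the comments in the policy definition.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SessionRules:
    """Hypothetical stand-in with the same field names as the policy above."""
    ttl_minutes: int = 60
    max_items: int = 100
    summarize_after: int = 50
    promote_threshold: float = 0.8


def lifecycle_actions(rules, created_at, now, item_count, importance):
    """Return the list of actions the rules call for, in a fixed order."""
    actions = []
    if now - created_at >= timedelta(minutes=rules.ttl_minutes):
        actions.append("expire")      # older than the TTL
    if item_count >= rules.summarize_after:
        actions.append("summarize")   # session has grown large enough to summarize
    if item_count > rules.max_items:
        actions.append("evict")       # over the hard capacity limit
    if importance >= rules.promote_threshold:
        actions.append("promote")     # important enough for a higher tier
    return actions


rules = SessionRules()
start = datetime(2024, 1, 1, 12, 0)
fresh_important = lifecycle_actions(rules, start, start + timedelta(minutes=30), 50, 0.9)
stale_minor = lifecycle_actions(rules, start, start + timedelta(minutes=61), 10, 0.2)
```

Keeping the rules in an immutable dataclass and the evaluation in a pure function makes the behavior easy to unit test with fixed timestamps.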
Routing
The Router automatically selects tiers based on:
- Explicit tier hints in metadata
- Importance score thresholds
- Access patterns (recency, frequency)
- Capacity constraints
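One way the four criteria above could be combined is sketched below. The precedence order (hint, then importance, then access frequency, then capacity), the `"tier"` metadata key, the access-count cutoff of 10, and the fall-back-to-a-lower-tier rule are all assumptions made for illustration; only the 0.3 and 0.7 importance thresholds come from the architecture diagram. Recency is left out to keep the sketch short.

```python
TIER_ORDER = ["ephemeral", "session", "persistent"]


def select_tier(importance, metadata=None, access_count=0, usage=None, capacity=None):
    """Pick a tier name; `usage` and `capacity` map tier names to item counts."""
    metadata = metadata or {}
    usage = usage or {}
    capacity = capacity or {}

    # 1. An explicit, valid tier hint wins outright (assumed metadata key: "tier")
    hint = metadata.get("tier")
    if hint in TIER_ORDER:
        return hint

    # 2. Importance thresholds from the architecture diagram
    if importance < 0.3:
        index = 0
    elif importance <= 0.7:
        index = 1
    else:
        index = 2

    # 3. Access pattern: frequently read memories move one tier up (assumed cutoff)
    if access_count >= 10 and index < len(TIER_ORDER) - 1:
        index += 1

    # 4. Capacity: if the chosen tier is full, fall back to the next lower tier
    while index > 0:
        name = TIER_ORDER[index]
        if usage.get(name, 0) < capacity.get(name, float("inf")):
            break
        index -= 1
    return TIER_ORDER[index]


hinted = select_tier(0.9, metadata={"tier": "ephemeral"})
boosted = select_tier(0.5, access_count=12)
overflow = select_tier(0.9, usage={"persistent": 100}, capacity={"persistent": 100})
```

Here `hinted` ignores the high importance score because of the explicit hint, `boosted` is lifted from the session tier by frequent access, and `overflow` lands one tier lower because the persistent tier is full.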
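Use Cases
Chatbot with Persistent Memory
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

from axon import MemorySystem
from axon.integrations.langchain import AxonChatMemory

# Create memory-backed chatbot
memory = AxonChatMemory(system=MemorySystem(...))
llm = ChatOpenAI(model="gpt-4")
# LLMChain requires a prompt; "history" is an assumed name for the variable
# that AxonChatMemory fills in, so adjust it to the adapter's actual memory key
prompt = PromptTemplate.from_template("{history}\nHuman: {input}\nAI:")
chain = LLMChain(llm=llm, prompt=prompt, memory=memory)

# Conversations persist across sessions (call this from inside an async function)
response = await chain.arun("What did we discuss last week?")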
RAG with Multi-Tier Storage
from axon import MemorySystem
from axon.integrations.llamaindex import AxonVectorStore
from llama_index.core import VectorStoreIndex

# Use Axon as LlamaIndex vector store
vector_store = AxonVectorStore(system=MemorySystem(...))
index = VectorStoreIndex.from_vector_store(vector_store)

# Query with automatic tier selection (call this from inside an async function)
query_engine = index.as_query_engine()
response = await query_engine.aquery("Explain quantum computing")
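Audit-Compliant Memory
from axon import MemorySystem
from axon.core import AuditLogger
from axon.core.templates import balanced

# PrivacyLevel and OperationType must also be imported from Axon;
# their import path is not shown on this page.

# Enable audit logging for compliance
audit_logger = AuditLogger(max_events=10000, enable_rotation=True)
config = balanced()  # for example, the balanced template from the Quick Example
system = MemorySystem(config=config, audit_logger=audit_logger)

# All operations are automatically logged (the awaits below need an async function)
await system.store("Sensitive user data", privacy_level=PrivacyLevel.RESTRICTED)

# Export audit trail
events = await system.export_audit_log(operation=OperationType.STORE)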
What's Next?
- Quick Start: Get up and running in 5 minutes with our quickstart guide.
- Core Concepts: Learn about tiers, policies, routing, and lifecycle management.
- API Reference: Comprehensive API documentation for all modules.
- Deployment: Production deployment guides, monitoring, and best practices.
Community & Support
- GitHub: Report Issues
- Discussions: Join the Community
- Examples: Browse Examples
License
Axon is released under the MIT License.