
📚 Traylinx Cortex API Reference

Version: 2.2.0
Base URL: http://localhost:8000
Last Updated: 2025-12-04

📚 Navigation: Main README | Docs Index | Quick Reference | Integration Guide


🔐 Authentication

Traylinx Cortex uses a secure, token-based authentication system similar to GitHub Personal Access Tokens.

How It Works

  1. Registration: Call POST /v1/users with your user details to receive an API token
  2. Token Storage: Store the token securely (it's only shown once)
  3. API Calls: Include the token in all subsequent requests via the Authorization header

Headers

| Header | Required | Description |
|---|---|---|
| Authorization | ✅ Yes (most endpoints) | `Bearer <token>` - Your API token |
| X-Trace-ID | ❌ Optional | UUID for distributed tracing |

Token Format

Tokens follow the format: ctx_<random_string> (e.g., ctx_abc123xyz...)

Security Features

  • Token Hashing: Tokens are hashed (SHA256) before storage - plaintext never stored
  • Encryption at Rest: Sensitive data (API keys) encrypted using Fernet
  • Ownership Isolation: Users can only access their own data

👤 User Management

Register User / Get Token

POST /v1/users

Creates a new user (or updates existing) and returns an API token. This is the entry point for new users.

Auth Required: ❌ No

Request Body:

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "email": "user@example.com",
  "first_name": "John",
  "last_name": "Doe",
  "switch_ai_api_key": "sk-your-api-key",
  "custom_attributes": {
    "company": "Acme Inc"
  },
  "token_name": "My Laptop"
}

| Field | Type | Required | Description |
|---|---|---|---|
| id | UUID | | User ID (from your auth system) |
| email | string | | User email address |
| first_name | string | | First name |
| last_name | string | | Last name |
| switch_ai_api_key | string | | Switch.AI API key (encrypted at rest) |
| custom_attributes | object | | Additional user metadata |
| token_name | string | | Name for the token (default: "Initial Token") |

Response (201 Created):

{
  "access_token": "ctx_abc123xyz...",
  "token_type": "Bearer"
}

⚠️ Important: Save this token immediately! It's only shown once.
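The registration call can be sketched with Python's standard library. The payload values are the example ones above; the request object is only constructed here, not sent:

```python
import json
import urllib.request

# Illustrative sketch: build (but do not send) the registration request.
payload = {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "email": "user@example.com",
    "first_name": "John",
    "last_name": "Doe",
    "token_name": "My Laptop",
}
req = urllib.request.Request(
    "http://localhost:8000/v1/users",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# Against a running server, this would be sent with urllib.request.urlopen(req).
```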


Get Current User

GET /v1/users/me

Returns details of the authenticated user.

Auth Required: ✅ Yes

Response (200 OK):

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "email": "user@example.com",
  "first_name": "John",
  "last_name": "Doe",
  "is_active": true,
  "created_at": "2025-12-02T10:00:00Z",
  "has_api_key": true,
  "api_key_preview": "sk-...abc123"
}

| Field | Description |
|---|---|
| has_api_key | Whether the user has a Switch.AI API key stored |
| api_key_preview | Masked preview of the key (null if no key is set) |

Update API Key

PUT /v1/users/me/api-key

Update or clear the user's Switch.AI API key.

Auth Required: ✅ Yes

Request Body:

{
  "api_key": "sk-new-api-key"
}

Response (200 OK):

{
  "message": "API key updated successfully",
  "has_api_key": true,
  "api_key_preview": "sk-...abc123"
}


Create API Token

POST /v1/users/me/tokens

Generate a new API token for the authenticated user. Useful for multiple devices or applications.

Auth Required: ✅ Yes

Request Body:

{
  "name": "Chrome on MacBook"
}

Response (201 Created):

{
  "access_token": "ctx_newtoken123...",
  "token_type": "Bearer"
}


List API Tokens

GET /v1/users/me/tokens

List all active tokens for the authenticated user. Token values are masked for security.

Auth Required: ✅ Yes

Response (200 OK):

{
  "tokens": [
    {
      "id": "token-uuid-1",
      "name": "My Laptop",
      "token_prefix": "ctx_abc123...",
      "last_used_at": "2025-12-02T12:00:00Z",
      "created_at": "2025-12-01T10:00:00Z"
    },
    {
      "id": "token-uuid-2",
      "name": "Mobile App",
      "token_prefix": "ctx_xyz789...",
      "last_used_at": null,
      "created_at": "2025-12-02T09:00:00Z"
    }
  ]
}


Revoke API Token

DELETE /v1/users/me/tokens/{token_id}

Revoke (delete) a specific API token. The token becomes immediately invalid.

Auth Required: ✅ Yes

Response (204 No Content)

Error Response (404 Not Found):

{
  "detail": "Token not found"
}


List Sessions

GET /v1/users/me/sessions

List all sessions for the current user with pagination.

Auth Required: ✅ Yes

Query Parameters:

- limit: int (default 20)
- offset: int (default 0)
- search: string (optional, filter by title)

Response (200 OK):

{
  "sessions": [
    {
      "id": "session-uuid-1",
      "title": "Trip to Tokyo",
      "app_id": "my-app",
      "created_at": "2025-12-02T12:00:00Z",
      "updated_at": "2025-12-02T12:05:00Z",
      "message_count": 5
    }
  ],
  "total": 1,
  "limit": 20,
  "offset": 0
}


Get User Profile

GET /v1/users/me/profile

Get user profile facts.

Auth Required: ✅ Yes

Query Parameters:

- app_id: string (default "default")

Response (200 OK):

{
  "user_id": "user-uuid",
  "app_id": "default",
  "facts": {
    "name": "John Doe",
    "preferences": "dark mode"
  },
  "updated_at": "2025-12-02T12:00:00Z"
}


Update User Profile

PATCH /v1/users/me/profile

Update user profile facts (merge with existing).

Auth Required: ✅ Yes

Request Body:

{
  "facts": {
    "location": "New York"
  }
}

Response (200 OK):

{
  "user_id": "user-uuid",
  "app_id": "default",
  "facts": {
    "name": "John Doe",
    "preferences": "dark mode",
    "location": "New York"
  },
  "updated_at": "2025-12-02T12:10:00Z"
}


Delete User Profile

DELETE /v1/users/me/profile

Delete user profile facts.

Auth Required: ✅ Yes

Query Parameters:

- app_id: string (default "default")

Response (204 No Content)


Extract Facts from Content

POST /v1/users/me/profile/extract

Extract structured user facts from unstructured content using AI. This endpoint is generic and app-agnostic - any application (web, mobile, etc.) can send content in various formats.

Auth Required: ✅ Yes

Query Parameters:

- app_id: string (default "default")

Request Body:

Accepts any combination of the following fields:

| Field | Type | Description |
|---|---|---|
| text | string | Natural language text (e.g., "My name is Sebastian, I live in Berlin") |
| data | object | Structured JSON object with any fields (nested objects are flattened) |
| raw | string | Any raw content to analyze |

Example Requests:

From natural language:

{
  "text": "I'm a software developer living in Madrid, Spain. I have a dog named Olivia."
}

From structured app data:

{
  "data": {
    "firstName": "John",
    "lastName": "Doe",
    "email": "john@example.com",
    "address": {
      "city": "Berlin",
      "country": "Germany"
    }
  }
}

From raw content:

{
  "raw": "User profile: name=Sebastian, location=Spain, job=developer"
}

Combined (all formats):

{
  "text": "I love hiking and photography",
  "data": {"hobby": "travel"},
  "raw": "favorite_color=blue"
}

Response (200 OK):

{
  "facts": {
    "firstName": "John",
    "lastName": "Doe",
    "email": "john@example.com",
    "city": "Berlin",
    "country": "Germany",
    "hobby": "hiking"
  }
}

LLM Prompt Used:

The endpoint uses this prompt to extract facts:

You are a fact extraction system. Extract user profile facts from the following content.

Return ONLY a valid JSON object with key-value pairs representing facts about the person.

Rules:
- Keys should be camelCase (e.g., "firstName", "dogName", "favoriteColor", "workLocation")
- Keep keys short, clear, and descriptive
- Values should be the extracted information as strings
- Only extract factual, personal information about the user
- Normalize similar fields (e.g., "first_name", "firstName", "name" -> use "name" or "firstName")
- For addresses, extract as separate fields: street, city, state, country, postalCode
- For dates, keep in ISO format if possible (YYYY-MM-DD)
- If no facts can be extracted, return an empty object {}
- Do NOT include sensitive data like passwords

Content to analyze:
{combined_content}

Return only the JSON object, no explanation or markdown:

Use Cases:

- Import user profile from authentication systems
- Parse natural language chat messages
- Process form data from mobile apps
- Extract facts from any structured or unstructured data

Error Handling (LLM Response Sanitization):

LLMs can return malformed responses. The endpoint uses a robust JSON sanitizer that handles:

| Issue | How It's Handled |
|---|---|
| Markdown code blocks | Strips ```json ... ``` wrappers |
| Mixed text + JSON | Extracts the JSON object from surrounding text |
| Trailing commas | Removes them: {"key": "value",} → {"key": "value"} |
| Single quotes | Converts Python-style 'key' to "key" |
| Unquoted keys | Converts {key: "value"} to {"key": "value"} |
| Python booleans | Converts True/False/None to true/false/null |
| Truncated JSON | Attempts to close unclosed braces |
| Empty/whitespace | Returns an empty facts object {} |
| Complete failure | Falls back to regex key-value extraction |

Validation Applied:

- Empty/null values are removed
- Placeholder values (unknown, N/A) are removed
- Keys are limited to 64 characters
- Values are limited to 1000 characters
- Invalid keys are converted to camelCase

The endpoint never fails with 500 for LLM parsing issues - it gracefully returns {"facts": {}} if extraction fails completely.
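A minimal sketch of such a sanitizer, covering a few of the cases above (this is not the service's actual implementation):

```python
import json
import re

def sanitize_llm_json(raw: str) -> dict:
    """Sketch: strip fences, extract the JSON object, fix common LLM artifacts."""
    text = raw.strip()
    # Strip Markdown code-fence wrappers.
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text).strip()
    # Extract the first {...} object from surrounding text.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return {}
    text = match.group(0)
    # Python-style literals -> JSON literals (naive: would also hit string contents).
    text = re.sub(r"\bTrue\b", "true", text)
    text = re.sub(r"\bFalse\b", "false", text)
    text = re.sub(r"\bNone\b", "null", text)
    # Remove trailing commas before a closing brace/bracket.
    text = re.sub(r",\s*([}\]])", r"\1", text)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {}  # graceful fallback, mirroring the "never 500" behaviour

facts = sanitize_llm_json('```json\n{"name": "Sebastian", "active": True,}\n```')
```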


🩺 Health & Observability

Get Basic Health

GET /health

Returns the basic status of the service.

Auth Required: ❌ No

Response (200 OK):

{
  "status": "ok"
}


Get Liveness Status

GET /health/live

Detailed check of all dependencies (PostgreSQL, Redis, LLM).

Auth Required: ❌ No

Response (200 OK):

{
  "status": "healthy",
  "checks": {
    "postgres": {
      "status": "healthy",
      "latency_ms": 1.2
    },
    "redis": {
      "status": "healthy",
      "latency_ms": 0.5
    },
    "llm": {
      "status": "healthy"
    }
  },
  "timestamp": "2025-12-02T12:00:00Z"
}

Possible Status Values:

- healthy - All systems operational
- degraded - Some systems have issues but the service is functional
- unhealthy - Critical systems are down


Get Readiness Status

GET /ready

Kubernetes readiness probe. Returns 200 if ready to serve traffic, 503 otherwise.

Auth Required: ❌ No

Response (200 OK):

{
  "ready": true,
  "message": "Service is ready"
}

Response (503 Service Unavailable):

{
  "ready": false,
  "message": "Service is not ready"
}


💬 Sessions

Create Session

POST /v1/session

Initialize a new conversation session.

Auth Required: ✅ Yes

Request Body:

{
  "app_id": "my-app",
  "metadata": {
    "source": "web",
    "language": "en"
  }
}

Response (201 Created):

{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "created_at": "2025-12-02T12:00:00Z"
}


Get Session Details

GET /v1/session/{session_id}

Retrieve details of a specific session.

Auth Required: ✅ Yes

Response (200 OK):

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "user_id": "user-123",
  "app_id": "my-app",
  "title": "Trip to Tokyo",
  "created_at": "2025-12-02T12:00:00Z",
  "updated_at": "2025-12-02T12:05:00Z",
  "metadata": {
    "source": "web"
  }
}


Update Session

PATCH /v1/session/{session_id}

Update session title or metadata.

Auth Required: ✅ Yes

Request Body:

{
  "title": "New Title",
  "metadata": {
    "new_field": "value"
  }
}

Response (200 OK):

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "title": "New Title",
  ...
}


Get Session History

GET /v1/session/{session_id}/history

Retrieve full message history for a session in chronological order.

Auth Required: ✅ Yes

Response (200 OK):

{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "messages": [
    {
      "id": "msg-uuid-1",
      "role": "user",
      "content": "Hello",
      "token_count": 2,
      "created_at": "2025-12-02T12:00:00Z"
    },
    {
      "id": "msg-uuid-2",
      "role": "assistant",
      "content": "Hi there! How can I help you today?",
      "token_count": 10,
      "created_at": "2025-12-02T12:00:01Z"
    }
  ]
}


Delete Session

DELETE /v1/session/{session_id}

Clears Short-Term Memory (STM) for the session. Long-Term Memory (LTM) is preserved.

Auth Required: ✅ Yes

Response (204 No Content)


🤖 Chat

Send Message

POST /v1/chat

Send a message to the AI agent. Supports both standard and streaming responses.

Auth Required: ✅ Yes

Request Body:

{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "message": "What is the capital of Japan?",
  "config": {
    "stream": false,
    "model_preference": "balanced",
    "switch_ai_api_key": null,
    "embedding_model": null
  }
}

Config Options:

| Field | Type | Default | Description |
|---|---|---|---|
| stream | boolean | false | Enable SSE streaming |
| model_preference | string | "balanced" | "fast", "balanced", or "powerful" |
| switch_ai_api_key | string | null | Override API key (uses the stored key if not provided) |
| embedding_model | string | null | Embedding model (e.g., mistral-embed) |

💡 API Key Fallback Chain: The chat endpoint resolves API keys in this priority order:

1. switch_ai_api_key in config (highest priority)
2. The user's stored key (set via PUT /v1/users/me/api-key)
3. The SWITCH_AI_API_KEY environment variable (server default)

Response (200 OK - Non-streaming):

{
  "message_id": "msg-uuid",
  "content": "The capital of Japan is Tokyo.",
  "usage": {
    "tokens_in": 20,
    "tokens_out": 10
  },
  "cost_usd": 0.00015,
  "model": "gpt-4o"
}

Response (200 OK - Streaming SSE):

When stream: true, returns Server-Sent Events:

event: message
data: {"chunk": "The", "id": "msg-123"}

event: message
data: {"chunk": " capital", "id": "msg-123"}

event: message
data: {"chunk": " of Japan is Tokyo.", "id": "msg-123"}

event: done
data: {"status": "completed", "usage": {"tokens_in": 20, "tokens_out": 10}, "cost_usd": 0.00015, "model": "gpt-4o"}

Event Types:

- message - Content chunk
- done - Stream completed with final stats
- error - An error occurred during streaming
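A client-side sketch of parsing these SSE frames (a real client would read the HTTP response incrementally rather than from a string):

```python
import json

def parse_sse(stream_text: str) -> list[tuple[str, dict]]:
    """Split an SSE stream into (event, data) pairs; data lines are JSON."""
    events = []
    for block in stream_text.strip().split("\n\n"):
        event, data = None, None
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data = json.loads(line[len("data:"):].strip())
        if event and data is not None:
            events.append((event, data))
    return events

frames = (
    'event: message\ndata: {"chunk": "The", "id": "msg-123"}\n\n'
    'event: done\ndata: {"status": "completed"}\n\n'
)
events = parse_sse(frames)
```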


🧠 Memory

Search Memory

POST /v1/memory/search

Search user's long-term memory by semantic similarity. Uses the user's stored Switch.AI API key for embedding generation (set via PUT /v1/users/me/api-key).

Auth Required: ✅ Yes

Request Body:

{
  "query": "What is my favorite color?",
  "limit": 15,
  "offset": 0,
  "min_similarity": 0.6,
  "app_id": "default",
  "switch_ai_api_key": null
}

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | | - | Search query text |
| app_id | string | | "default" | Application identifier |
| limit | integer | | 15 | Max results (1-50) |
| offset | integer | | 0 | Pagination offset |
| min_similarity | float | | 0.6 | Minimum similarity threshold (0.0-1.0) |
| switch_ai_api_key | string | | null | Override API key (uses the stored key if not provided) |

Response (200 OK):

{
  "results": [
    {
      "id": "mem-uuid-1",
      "content": "User's favorite color is blue",
      "similarity": 0.92,
      "created_at": "2025-12-01T10:00:00Z"
    }
  ],
  "total": 1,
  "limit": 15,
  "offset": 0
}

💡 API Key Fallback Chain: The search endpoint resolves API keys in this priority order:

1. switch_ai_api_key in the request body (highest priority)
2. The user's stored key (set via PUT /v1/users/me/api-key)
3. The SWITCH_AI_API_KEY environment variable (server default)


List User Memories

GET /v1/memory/me

Retrieve all memories for the authenticated user with pagination support. Memories are returned with importance scores and temporal decay applied.

Auth Required: ✅ Yes

Query Parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| app_id | string | | - | Application identifier |
| limit | integer | | 15 | Number of results (1-100) |
| offset | integer | | 0 | Pagination offset |

Request Example:

curl -X GET "https://api.traylinx.com/v1/memory/me?app_id=my-app&limit=20&offset=0" \
  -H "Authorization: Bearer ctx_your_token_here"

Response (200 OK):

{
  "memories": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "content": "User is allergic to peanuts",
      "importance_score": 0.9,
      "tags": ["medical", "allergy"],
      "created_at": "2025-12-01T10:00:00Z"
    },
    {
      "id": "660e8400-e29b-41d4-a716-446655440001",
      "content": "User prefers dark mode",
      "importance_score": 0.7,
      "tags": ["preference", "ui"],
      "created_at": "2025-12-02T14:30:00Z"
    }
  ],
  "total": 42,
  "limit": 20,
  "offset": 0
}

Importance Score Levels:

- 0.9: Medical/Safety (allergies, medications, emergencies)
- 0.85: Identity (name, personal identifiers)
- 0.75: Summaries (conversation summaries)
- 0.7: Preferences (likes, dislikes, wants)
- 0.65: Decisions (plans, choices)
- 0.5: Default (general information)
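As a sketch, these levels could be expressed as a lookup table (the category names are illustrative; the service's internal labels are not documented here):

```python
# Illustrative mapping of the importance levels listed above.
IMPORTANCE_SCORES = {
    "medical_safety": 0.9,
    "identity": 0.85,
    "summary": 0.75,
    "preference": 0.7,
    "decision": 0.65,
    "default": 0.5,
}

def importance_for(category: str) -> float:
    """Return the score for a category, falling back to the default level."""
    return IMPORTANCE_SCORES.get(category, IMPORTANCE_SCORES["default"])
```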


Delete Single Memory

DELETE /v1/memory/{memory_id}

Delete a specific memory by ID. Only the memory owner can delete it.

Auth Required: ✅ Yes

Path Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| memory_id | UUID | | Memory identifier |

Request Example:

curl -X DELETE "https://api.traylinx.com/v1/memory/550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer ctx_your_token_here"

Response (200 OK):

{
  "success": true,
  "message": "Memory deleted successfully"
}

Error Responses:

| Status | Code | Description |
|---|---|---|
| 404 | MEMORY_NOT_FOUND | Memory doesn't exist or is not owned by the user |
| 401 | AUTHENTICATION_FAILED | Invalid or missing token |

Delete All User Memories (GDPR)

DELETE /v1/memory/user/all

Delete all memories for the authenticated user. This endpoint supports GDPR "right to be forgotten" compliance.

Auth Required: ✅ Yes

Query Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| app_id | string | | Application identifier |

Request Example:

curl -X DELETE "https://api.traylinx.com/v1/memory/user/all?app_id=my-app" \
  -H "Authorization: Bearer ctx_your_token_here"

Response (200 OK):

{
  "success": true,
  "deleted_count": 42
}

⚠️ Warning: This action is irreversible. All memories for the user in the specified app will be permanently deleted.


Find Duplicate Memories

GET /v1/memory/duplicates

Find groups of semantically similar (duplicate) memories. Useful for identifying redundant information before cleanup.

Auth Required: ✅ Yes

Query Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| app_id | string | | Application identifier |

Request Example:

curl -X GET "https://api.traylinx.com/v1/memory/duplicates?app_id=my-app" \
  -H "Authorization: Bearer ctx_your_token_here"

Response (200 OK):

{
  "groups": [
    {
      "canonical_id": "550e8400-e29b-41d4-a716-446655440000",
      "canonical_content": "User's name is Sebastian",
      "duplicates": [
        {
          "id": "660e8400-e29b-41d4-a716-446655440001",
          "content": "User is referred to as Sebastian",
          "created_at": "2025-12-02T14:30:00Z",
          "similarity_to_canonical": 0.92
        },
        {
          "id": "770e8400-e29b-41d4-a716-446655440002",
          "content": "User is called Sebastian",
          "created_at": "2025-12-03T10:00:00Z",
          "similarity_to_canonical": 0.89
        }
      ]
    }
  ],
  "total_duplicates": 2
}

Response Fields:

- groups: Array of duplicate groups, each containing a canonical (oldest) memory and its duplicates
- canonical_id: ID of the memory to keep (the oldest in the group)
- canonical_content: Content of the canonical memory
- duplicates: Array of duplicate memories with similarity scores
- total_duplicates: Total count of duplicate memories across all groups


Merge Duplicate Memories

POST /v1/memory/deduplicate

Merge duplicate memory groups by keeping the oldest (canonical) memory and deleting all newer duplicates.

Auth Required: ✅ Yes

Query Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| app_id | string | | Application identifier |

Request Example:

curl -X POST "https://api.traylinx.com/v1/memory/deduplicate?app_id=my-app" \
  -H "Authorization: Bearer ctx_your_token_here"

Response (200 OK):

{
  "success": true,
  "deleted_count": 15,
  "groups_merged": 5
}

Response Fields:

- success: Whether the operation completed successfully
- deleted_count: Total number of duplicate memories deleted
- groups_merged: Number of duplicate groups that were merged

💡 Tip: Run GET /v1/memory/duplicates first to preview what will be merged before calling this endpoint.


🔄 Agent-to-Agent (A2A)

Create Conversation (A2A)

POST /a2a/conversation/create

Create a session using the A2A envelope protocol.

Request Body:

{
  "envelope": {
    "message_id": "msg-source-1",
    "sender_agent_key": "agent-client",
    "timestamp": "2025-12-02T12:00:00Z"
  },
  "user_id": "user-123",
  "app_id": "my-app"
}

Response (201 Created):

{
  "envelope": {
    "message_id": "msg-resp-1",
    "sender_agent_key": "traylinx-cortex",
    "timestamp": "2025-12-02T12:00:01Z",
    "in_reply_to": "msg-source-1"
  },
  "session_id": "...",
  "created_at": "..."
}


Chat (A2A)

POST /a2a/conversation/chat

Send a message using the A2A envelope protocol.

Request Body:

{
  "envelope": {
    "message_id": "msg-source-2",
    "sender_agent_key": "agent-client",
    "timestamp": "2025-12-02T12:00:10Z"
  },
  "action": "chat",
  "session_id": "...",
  "message": "Hello",
  "user_id": "user-123"
}


⚠️ Error Handling

All errors follow a standardized format:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human readable message",
    "trace_id": "abc-123",
    "details": {}
  }
}

Common Error Codes

| Code | HTTP Status | Description |
|---|---|---|
| SESSION_NOT_FOUND | 404 | Session ID does not exist |
| VALIDATION_ERROR | 400 | Invalid request parameters |
| AUTHENTICATION_FAILED | 401 | Invalid or missing token |
| TOKEN_BUDGET_EXCEEDED | 400 | Request exceeds token limits |
| LLM_SERVICE_ERROR | 502 | Upstream LLM provider failed |
| INTERNAL_ERROR | 500 | Unexpected server error |

📋 Implementation Status

✅ Fully Implemented

| Endpoint | Method | Description |
|---|---|---|
| /v1/users | POST | User registration |
| /v1/users/me | GET | Get current user |
| /v1/users/me/tokens | POST | Create token |
| /v1/users/me/tokens | GET | List tokens |
| /v1/users/me/tokens/{id} | DELETE | Revoke token |
| /v1/users/me/api-key | PUT | Update Switch.AI API key |
| /v1/users/me/sessions | GET | List user sessions |
| /v1/users/me/profile | GET | Get user profile |
| /v1/users/me/profile | PATCH | Update profile |
| /v1/users/me/profile | DELETE | Clear profile |
| /v1/session | POST | Create session |
| /v1/session/{id} | GET | Get session |
| /v1/session/{id} | PATCH | Update session |
| /v1/session/{id} | DELETE | Delete session |
| /v1/session/{id}/history | GET | Get history |
| /v1/chat | POST | Chat completion |
| /v1/memory/search | POST | Semantic memory search |
| /v1/memory/me | GET | List user memories |
| /v1/memory/{id} | DELETE | Delete single memory |
| /v1/memory/user/all | DELETE | Delete all user memories (GDPR) |
| /v1/memory/duplicates | GET | Find duplicate memory groups |
| /v1/memory/deduplicate | POST | Merge duplicate memories |
| /health | GET | Basic health |
| /health/live | GET | Detailed health |
| /ready | GET | Readiness check |

🚧 Planned (Not Yet Implemented)

| Endpoint | Method | Description |
|---|---|---|
| /v1/session/{id}/context | GET | Get assembled context |
| /v1/memory/consolidate | POST | Force memory consolidation |

🔧 Configuration

Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| DATABASE_URL | | - | PostgreSQL connection URL |
| REDIS_URL | | - | Redis connection URL |
| ENCRYPTION_KEY | | - | Fernet key for data encryption |
| LLM_BASE_URL | | Switch.AI | LLM provider base URL |
| EMBEDDING_BASE_URL | | Switch.AI | Embedding provider URL |
| SWITCH_AI_API_KEY | | - | Default API key (fallback when no user key is set) |
| MODEL_FAST | | openai/gemini-2.5-flash | Fast model with provider prefix |
| MODEL_BALANCED | | openai/llama-3.3-70b-versatile | Balanced model with provider prefix |
| MODEL_POWERFUL | | openai/deepseek-r1-distill-llama-70b | Powerful model with provider prefix |
| CELERY_BROKER_URL | | redis://redis:6379/1 | Celery message broker URL |
| CELERY_RESULT_BACKEND | | redis://redis:6379/2 | Celery task results store URL |
| MEMORY_TEMPORAL_DECAY | | 0.05 | Memory decay rate per day (0.0-1.0) |
| MEMORY_MIN_IMPORTANCE | | 0.3 | Minimum importance score to store a memory (0.0-1.0) |
| MEMORY_MAX_AGE_DAYS | | 365 | Maximum age in days for memories in search |
| MEMORY_DEDUP_SIMILARITY_THRESHOLD | | 0.85 | Minimum similarity to consider memories duplicates (0.0-1.0) |
| MEMORY_DEDUP_LLM_CHECK_ENABLED | | true | Enable LLM-based semantic equivalence check for borderline cases |
| MEMORY_DEDUP_LLM_THRESHOLD_LOW | | 0.85 | Lower bound similarity for the LLM equivalence check |
| MEMORY_DEDUP_LLM_THRESHOLD_HIGH | | 0.95 | Upper bound similarity (above this, skip the LLM check) |

⚠️ LiteLLM Provider Prefixes: Model names must include provider prefixes when using LiteLLM routing. Common prefixes:

- openai/ - For OpenAI-compatible endpoints (including SwitchAI)
- anthropic/ - For Anthropic models
- google/ - For Google Gemini models

See the LiteLLM docs for the full list.

Memory System Configuration

The memory system uses temporal decay and importance scoring to prioritize relevant information:

Temporal Decay (MEMORY_TEMPORAL_DECAY):

- Controls how quickly memories lose relevance over time
- Formula: weight = exp(-decay_rate × days_old)
- The default of 0.05 means memories retain ~22% weight after 30 days
- Lower values = slower decay (memories stay relevant longer)
- Higher values = faster decay (prioritize recent information)
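The decay formula can be checked numerically:

```python
import math

def memory_weight(days_old: float, decay_rate: float = 0.05) -> float:
    """Temporal decay weight: exp(-decay_rate * days_old)."""
    return math.exp(-decay_rate * days_old)

# With the default rate of 0.05, a 30-day-old memory keeps ~22% of its weight.
weight_30d = memory_weight(30)
```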

Minimum Importance (MEMORY_MIN_IMPORTANCE):

- Filters out low-importance memories during storage
- Memories below this threshold are not saved
- The default of 0.3 filters trivial information while keeping preferences (0.7) and critical info (0.9)

Maximum Age (MEMORY_MAX_AGE_DAYS):

- Excludes memories older than this from search results
- Improves performance by reducing the search space
- Default: 365 days (1 year)

Memory Deduplication Configuration

The memory system automatically deduplicates semantically similar facts during extraction to prevent storing redundant information like "User's name is Sebastian" and "User is called Sebastian" as separate memories.

Similarity Threshold (MEMORY_DEDUP_SIMILARITY_THRESHOLD):

- Minimum cosine similarity to consider two memories duplicates
- The default of 0.85 catches most semantic duplicates
- Higher values = stricter matching (fewer duplicates detected)
- Lower values = looser matching (more aggressive deduplication)

LLM Check Enabled (MEMORY_DEDUP_LLM_CHECK_ENABLED):

- When true, uses an LLM to verify semantic equivalence for borderline cases
- Adds accuracy but increases latency and cost
- Set to false to rely solely on embedding similarity

LLM Threshold Range (MEMORY_DEDUP_LLM_THRESHOLD_LOW / HIGH):

- Defines the "borderline" similarity range where the LLM check is triggered
- Below LOW: not a duplicate (store the memory)
- Between LOW and HIGH: ask the LLM to verify equivalence
- Above HIGH: definite duplicate (skip without the LLM check)
- The default range [0.85, 0.95] balances accuracy and cost

Deduplication Flow:

New Fact → Normalize → Generate Embedding → Search Similar
                              similarity < 0.85 → STORE
                              similarity ≥ 0.95 → SKIP (duplicate)
                              0.85 ≤ similarity < 0.95 → LLM Check
                                              LLM says equivalent → SKIP
                                              LLM says different → STORE
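The routing in the flow above can be sketched as a small decision function (a sketch using the default thresholds; not the service's actual code):

```python
def dedup_decision(similarity: float, low: float = 0.85, high: float = 0.95) -> str:
    """Route a candidate fact based on its similarity to the closest stored memory."""
    if similarity < low:
        return "store"       # not a duplicate
    if similarity >= high:
        return "skip"        # definite duplicate, no LLM check needed
    return "llm_check"       # borderline: ask the LLM to verify equivalence
```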


**[← Quick Reference](./QUICK_REFERENCE.md)** | **[Docs Index](./README.md)** | **[Integration Guide →](./integration_guide.md)**