📚 Traylinx Cortex API Reference¶
Version: 2.2.0
Base URL: http://localhost:8000
Last Updated: 2025-12-04
📚 Navigation: Main README | Docs Index | Quick Reference | Integration Guide
🔐 Authentication¶
Traylinx Cortex uses a secure, token-based authentication system similar to GitHub Personal Access Tokens.
How It Works¶
- Registration: Call `POST /v1/users` with your user details to receive an API token
- Token Storage: Store the token securely (it's only shown once)
- API Calls: Include the token in all subsequent requests via the `Authorization` header
Headers¶
| Header | Required | Description |
|---|---|---|
| `Authorization` | ✅ Yes (most endpoints) | `Bearer <token>` - Your API token |
| `X-Trace-ID` | ❌ Optional | UUID for distributed tracing |
Token Format¶
Tokens follow the format: ctx_<random_string> (e.g., ctx_abc123xyz...)
Security Features¶
- Token Hashing: Tokens are hashed (SHA256) before storage - plaintext never stored
- Encryption at Rest: Sensitive data (API keys) encrypted using Fernet
- Ownership Isolation: Users can only access their own data
👤 User Management¶
Register User / Get Token¶
POST /v1/users
Creates a new user (or updates existing) and returns an API token. This is the entry point for new users.
Auth Required: ❌ No
Request Body:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"email": "user@example.com",
"first_name": "John",
"last_name": "Doe",
"switch_ai_api_key": "sk-your-api-key",
"custom_attributes": {
"company": "Acme Inc"
},
"token_name": "My Laptop"
}
| Field | Type | Required | Description |
|---|---|---|---|
| `id` | UUID | ✅ | User ID (from your auth system) |
| `email` | string | ✅ | User email address |
| `first_name` | string | ❌ | First name |
| `last_name` | string | ❌ | Last name |
| `switch_ai_api_key` | string | ❌ | Switch.AI API key (encrypted at rest) |
| `custom_attributes` | object | ❌ | Additional user metadata |
| `token_name` | string | ❌ | Name for the token (default: "Initial Token") |
Response (201 Created):
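The response body is not reproduced in this reference. An illustrative (not authoritative) shape, based on the token format and fields described elsewhere in this document:

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "token": "ctx_abc123xyz...",
  "token_name": "My Laptop"
}
```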
⚠️ Important: Save this token immediately! It's only shown once.
Get Current User¶
GET /v1/users/me
Returns details of the authenticated user.
Auth Required: ✅ Yes
Response (200 OK):
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"email": "user@example.com",
"first_name": "John",
"last_name": "Doe",
"is_active": true,
"created_at": "2025-12-02T10:00:00Z",
"has_api_key": true,
"api_key_preview": "sk-...abc123"
}
| Field | Description |
|---|---|
| `has_api_key` | Whether user has a Switch.AI API key stored |
| `api_key_preview` | Masked preview of the key (null if no key set) |
Update API Key¶
PUT /v1/users/me/api-key
Update or clear the user's Switch.AI API key.
Auth Required: ✅ Yes
Request Body:
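The request body is omitted above. An illustrative example, inferred from the `switch_ai_api_key` field used at registration (the exact schema may differ):

```json
{
  "switch_ai_api_key": "sk-your-new-api-key"
}
```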
Response (200 OK):
{
"message": "API key updated successfully",
"has_api_key": true,
"api_key_preview": "sk-...abc123"
}
Create API Token¶
POST /v1/users/me/tokens
Generate a new API token for the authenticated user. Useful for multiple devices or applications.
Auth Required: ✅ Yes
Request Body:
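The request body is omitted above. An illustrative example, inferred from the `name` field shown in the token list response (the exact schema may differ):

```json
{
  "name": "Mobile App"
}
```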
Response (201 Created):
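An illustrative response shape (the actual body is not shown in this reference; as with registration, the plaintext token appears only once):

```json
{
  "id": "token-uuid-3",
  "name": "Mobile App",
  "token": "ctx_newtoken..."
}
```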
List API Tokens¶
GET /v1/users/me/tokens
List all active tokens for the authenticated user. Token values are masked for security.
Auth Required: ✅ Yes
Response (200 OK):
{
"tokens": [
{
"id": "token-uuid-1",
"name": "My Laptop",
"token_prefix": "ctx_abc123...",
"last_used_at": "2025-12-02T12:00:00Z",
"created_at": "2025-12-01T10:00:00Z"
},
{
"id": "token-uuid-2",
"name": "Mobile App",
"token_prefix": "ctx_xyz789...",
"last_used_at": null,
"created_at": "2025-12-02T09:00:00Z"
}
]
}
Revoke API Token¶
DELETE /v1/users/me/tokens/{token_id}
Revoke (delete) a specific API token. The token becomes immediately invalid.
Auth Required: ✅ Yes
Response (204 No Content)
Error Response (404 Not Found):
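The body follows the standardized error envelope described under Error Handling; the `code` value here is illustrative, not confirmed by this reference:

```json
{
  "error": {
    "code": "TOKEN_NOT_FOUND",
    "message": "Token not found",
    "trace_id": "abc-123",
    "details": {}
  }
}
```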
List Sessions¶
GET /v1/users/me/sessions
List all sessions for the current user with pagination.
Auth Required: ✅ Yes
Query Parameters:
- limit: int (default 20)
- offset: int (default 0)
- search: string (optional, filter by title)
Response (200 OK):
{
"sessions": [
{
"id": "session-uuid-1",
"title": "Trip to Tokyo",
"app_id": "my-app",
"created_at": "2025-12-02T12:00:00Z",
"updated_at": "2025-12-02T12:05:00Z",
"message_count": 5
}
],
"total": 1,
"limit": 20,
"offset": 0
}
Get User Profile¶
GET /v1/users/me/profile
Get user profile facts.
Auth Required: ✅ Yes
Query Parameters:
- app_id: string (default "default")
Response (200 OK):
{
"user_id": "user-uuid",
"app_id": "default",
"facts": {
"name": "John Doe",
"preferences": "dark mode"
},
"updated_at": "2025-12-02T12:00:00Z"
}
Update User Profile¶
PATCH /v1/users/me/profile
Update user profile facts (merge with existing).
Auth Required: ✅ Yes
Request Body:
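The request body is omitted above. An illustrative example, consistent with the merged `facts` shown in the response (the exact schema may differ):

```json
{
  "facts": {
    "location": "New York"
  }
}
```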
Response (200 OK):
{
"user_id": "user-uuid",
"app_id": "default",
"facts": {
"name": "John Doe",
"preferences": "dark mode",
"location": "New York"
},
"updated_at": "2025-12-02T12:10:00Z"
}
Delete User Profile¶
DELETE /v1/users/me/profile
Delete user profile facts.
Auth Required: ✅ Yes
Query Parameters:
- app_id: string (default "default")
Response (204 No Content)
Extract Facts from Content¶
POST /v1/users/me/profile/extract
Extract structured user facts from unstructured content using AI. This endpoint is generic and app-agnostic - any application (web, mobile, etc.) can send content in various formats.
Auth Required: ✅ Yes
Query Parameters:
- app_id: string (default "default")
Request Body:
Accepts any combination of the following fields:
| Field | Type | Description |
|---|---|---|
| `text` | string | Natural language text (e.g., "My name is Sebastian, I live in Berlin") |
| `data` | object | Structured JSON object with any fields (nested objects are flattened) |
| `raw` | string | Any raw content to analyze |
Example Requests:
From natural language:
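```json
{
  "text": "My name is Sebastian, I live in Berlin"
}
```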
From structured app data:
{
"data": {
"firstName": "John",
"lastName": "Doe",
"email": "john@example.com",
"address": {
"city": "Berlin",
"country": "Germany"
}
}
}
From raw content:
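```json
{
  "raw": "favorite_color=blue"
}
```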
Combined (all formats):
{
"text": "I love hiking and photography",
"data": {"hobby": "travel"},
"raw": "favorite_color=blue"
}
Response (200 OK):
{
"facts": {
"firstName": "John",
"lastName": "Doe",
"email": "john@example.com",
"city": "Berlin",
"country": "Germany",
"hobby": "hiking"
}
}
LLM Prompt Used:
The endpoint uses this prompt to extract facts:
You are a fact extraction system. Extract user profile facts from the following content.
Return ONLY a valid JSON object with key-value pairs representing facts about the person.
Rules:
- Keys should be camelCase (e.g., "firstName", "dogName", "favoriteColor", "workLocation")
- Keep keys short, clear, and descriptive
- Values should be the extracted information as strings
- Only extract factual, personal information about the user
- Normalize similar fields (e.g., "first_name", "firstName", "name" -> use "name" or "firstName")
- For addresses, extract as separate fields: street, city, state, country, postalCode
- For dates, keep in ISO format if possible (YYYY-MM-DD)
- If no facts can be extracted, return an empty object {}
- Do NOT include sensitive data like passwords
Content to analyze:
{combined_content}
Return only the JSON object, no explanation or markdown:
Use Cases:
- Import user profile from authentication systems
- Parse natural language chat messages
- Process form data from mobile apps
- Extract facts from any structured or unstructured data
Error Handling (LLM Response Sanitization):
LLMs can return malformed responses. The endpoint uses a robust JSON sanitizer that handles:
| Issue | How It's Handled |
|---|---|
| Markdown code blocks | Strips ```json ... ``` wrappers |
| Mixed text + JSON | Extracts JSON object from surrounding text |
| Trailing commas | Removes {"key": "value",} → {"key": "value"} |
| Single quotes | Converts Python-style 'key' to "key" |
| Unquoted keys | Converts {key: "value"} to {"key": "value"} |
| Python booleans | Converts True/False/None to true/false/null |
| Truncated JSON | Attempts to close unclosed braces |
| Empty/whitespace | Returns empty facts object {} |
| Complete failure | Fallback to regex key-value extraction |
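A few of the behaviors above can be approximated in a handful of lines. This is an illustrative sketch covering only a subset (fence stripping, object extraction, trailing commas), not the service's actual sanitizer:

```python
import json
import re


def sanitize_llm_json(text: str) -> dict:
    """Best-effort cleanup of an LLM response before JSON parsing (illustrative subset)."""
    # Strip markdown code fences such as ```json ... ```
    text = re.sub(r"```(?:json)?", "", text).strip()
    # Extract the first {...} object from any surrounding prose
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return {}
    candidate = match.group(0)
    # Remove trailing commas before a closing brace/bracket
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return {}


print(sanitize_llm_json('Here you go:\n```json\n{"name": "John",}\n```'))  # {'name': 'John'}
```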
Validation Applied:
- Empty/null values are removed
- Placeholder values (unknown, N/A) are removed
- Keys limited to 64 characters
- Values limited to 1000 characters
- Invalid keys converted to camelCase
The endpoint never fails with 500 for LLM parsing issues - it gracefully returns {"facts": {}} if extraction fails completely.
🩺 Health & Observability¶
Get Basic Health¶
GET /health
Returns the basic status of the service.
Auth Required: ❌ No
Response (200 OK):
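The response body is not shown in this reference; a minimal illustrative shape would be:

```json
{
  "status": "ok"
}
```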
Get Liveness Status¶
GET /health/live
Detailed check of all dependencies (PostgreSQL, Redis, LLM).
Auth Required: ❌ No
Response (200 OK):
{
"status": "healthy",
"checks": {
"postgres": {
"status": "healthy",
"latency_ms": 1.2
},
"redis": {
"status": "healthy",
"latency_ms": 0.5
},
"llm": {
"status": "healthy"
}
},
"timestamp": "2025-12-02T12:00:00Z"
}
Possible Status Values:
- healthy - All systems operational
- degraded - Some systems have issues but service is functional
- unhealthy - Critical systems are down
Get Readiness Status¶
GET /ready
Kubernetes readiness probe. Returns 200 if ready to serve traffic, 503 otherwise.
Auth Required: ❌ No
Response (200 OK):
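The body is not shown in this reference; an illustrative shape:

```json
{
  "status": "ready"
}
```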
Response (503 Service Unavailable):
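Likewise illustrative (the actual body is not documented here):

```json
{
  "status": "not_ready"
}
```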
💬 Sessions¶
Create Session¶
POST /v1/session
Initialize a new conversation session.
Auth Required: ✅ Yes
Request Body:
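The request body is omitted above. An illustrative example, inferred from the session fields returned by `GET /v1/session/{session_id}` (the exact schema may differ):

```json
{
  "app_id": "my-app",
  "title": "Trip to Tokyo",
  "metadata": {
    "source": "web"
  }
}
```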
Response (201 Created):
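An illustrative response, mirroring the session object returned by `GET /v1/session/{session_id}`:

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "user_id": "user-123",
  "app_id": "my-app",
  "title": "Trip to Tokyo",
  "created_at": "2025-12-02T12:00:00Z"
}
```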
Get Session Details¶
GET /v1/session/{session_id}
Retrieve details of a specific session.
Auth Required: ✅ Yes
Response (200 OK):
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"user_id": "user-123",
"app_id": "my-app",
"title": "Trip to Tokyo",
"created_at": "2025-12-02T12:00:00Z",
"updated_at": "2025-12-02T12:05:00Z",
"metadata": {
"source": "web"
}
}
Update Session¶
PATCH /v1/session/{session_id}
Update session title or metadata.
Auth Required: ✅ Yes
Request Body:
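The request body is omitted above. An illustrative example (both fields appear optional; the exact schema may differ):

```json
{
  "title": "Trip to Kyoto",
  "metadata": {
    "source": "mobile"
  }
}
```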
Response (200 OK):
Get Session History¶
GET /v1/session/{session_id}/history
Retrieve full message history for a session in chronological order.
Auth Required: ✅ Yes
Response (200 OK):
{
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"messages": [
{
"id": "msg-uuid-1",
"role": "user",
"content": "Hello",
"token_count": 2,
"created_at": "2025-12-02T12:00:00Z"
},
{
"id": "msg-uuid-2",
"role": "assistant",
"content": "Hi there! How can I help you today?",
"token_count": 10,
"created_at": "2025-12-02T12:00:01Z"
}
]
}
Delete Session¶
DELETE /v1/session/{session_id}
Clears Short-Term Memory (STM) for the session. Long-Term Memory (LTM) is preserved.
Auth Required: ✅ Yes
Response (204 No Content)
🤖 Chat¶
Send Message¶
POST /v1/chat
Send a message to the AI agent. Supports both standard and streaming responses.
Auth Required: ✅ Yes
Request Body:
{
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"message": "What is the capital of Japan?",
"config": {
"stream": false,
"model_preference": "balanced",
"switch_ai_api_key": null,
"embedding_model": null
}
}
Config Options:
| Field | Type | Default | Description |
|---|---|---|---|
| `stream` | boolean | `false` | Enable SSE streaming |
| `model_preference` | string | `"balanced"` | `"fast"`, `"balanced"`, or `"powerful"` |
| `switch_ai_api_key` | string | `null` | Override API key (uses stored key if not provided) |
| `embedding_model` | string | `null` | Embedding model (e.g., `mistral-embed`) |
💡 API Key Fallback Chain: The chat endpoint uses API keys in this priority order:
1. `switch_ai_api_key` in config (highest priority)
2. User's stored key (set via `PUT /v1/users/me/api-key`)
3. `SWITCH_AI_API_KEY` environment variable (server default)
Response (200 OK - Non-streaming):
{
"message_id": "msg-uuid",
"content": "The capital of Japan is Tokyo.",
"usage": {
"tokens_in": 20,
"tokens_out": 10
},
"cost_usd": 0.00015,
"model": "gpt-4o"
}
Response (200 OK - Streaming SSE):
When stream: true, returns Server-Sent Events:
event: message
data: {"chunk": "The", "id": "msg-123"}
event: message
data: {"chunk": " capital", "id": "msg-123"}
event: message
data: {"chunk": " of Japan is Tokyo.", "id": "msg-123"}
event: done
data: {"status": "completed", "usage": {"tokens_in": 20, "tokens_out": 10}, "cost_usd": 0.00015, "model": "gpt-4o"}
Event Types:
- message - Content chunk
- done - Stream completed with final stats
- error - Error occurred during streaming
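A client needs to split this stream into events before it can reassemble the reply. The sketch below parses SSE text of the shape shown above into `(event, data)` pairs; it is an illustration of the wire format, not an official client:

```python
import json


def parse_sse(stream_text: str) -> list:
    """Parse Server-Sent Events text into (event_type, parsed_data) pairs."""
    events = []
    event_type = None
    for line in stream_text.splitlines():
        if line.startswith("event:"):
            # Remember the event type for the data line that follows
            event_type = line.split(":", 1)[1].strip()
        elif line.startswith("data:"):
            payload = json.loads(line.split(":", 1)[1].strip())
            events.append((event_type, payload))
    return events


raw = (
    "event: message\n"
    'data: {"chunk": "The", "id": "msg-123"}\n'
    "event: done\n"
    'data: {"status": "completed"}\n'
)
for event, data in parse_sse(raw):
    print(event, data)
```

A real client would feed chunks from the HTTP response into the same line-based logic and concatenate the `chunk` fields of `message` events until a `done` event arrives.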
🧠 Memory¶
Search Memory¶
POST /v1/memory/search
Search user's long-term memory by semantic similarity. Uses the user's stored Switch.AI API key for embedding generation (set via PUT /v1/users/me/api-key).
Auth Required: ✅ Yes
Request Body:
{
"query": "What is my favorite color?",
"limit": 15,
"offset": 0,
"min_similarity": 0.6,
"app_id": "default",
"switch_ai_api_key": null
}
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `query` | string | ✅ | - | Search query text |
| `app_id` | string | ❌ | `"default"` | Application identifier |
| `limit` | integer | ❌ | 15 | Max results (1-50) |
| `offset` | integer | ❌ | 0 | Pagination offset |
| `min_similarity` | float | ❌ | 0.6 | Minimum similarity threshold (0.0-1.0) |
| `switch_ai_api_key` | string | ❌ | `null` | Override API key (uses stored key if not provided) |
Response (200 OK):
{
"results": [
{
"id": "mem-uuid-1",
"content": "User's favorite color is blue",
"similarity": 0.92,
"created_at": "2025-12-01T10:00:00Z"
}
],
"total": 1,
"limit": 15,
"offset": 0
}
💡 API Key Fallback Chain: The search endpoint uses API keys in this priority order:
1. `switch_ai_api_key` in request body (highest priority)
2. User's stored key (set via `PUT /v1/users/me/api-key`)
3. `SWITCH_AI_API_KEY` environment variable (server default)
List User Memories¶
GET /v1/memory/me
Retrieve all memories for the authenticated user with pagination support. Memories are returned with importance scores and temporal decay applied.
Auth Required: ✅ Yes
Query Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `app_id` | string | ✅ | - | Application identifier |
| `limit` | integer | ❌ | 15 | Number of results (1-100) |
| `offset` | integer | ❌ | 0 | Pagination offset |
Request Example:
curl -X GET "https://api.traylinx.com/v1/memory/me?app_id=my-app&limit=20&offset=0" \
-H "Authorization: Bearer ctx_your_token_here"
Response (200 OK):
{
"memories": [
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"content": "User is allergic to peanuts",
"importance_score": 0.9,
"tags": ["medical", "allergy"],
"created_at": "2025-12-01T10:00:00Z"
},
{
"id": "660e8400-e29b-41d4-a716-446655440001",
"content": "User prefers dark mode",
"importance_score": 0.7,
"tags": ["preference", "ui"],
"created_at": "2025-12-02T14:30:00Z"
}
],
"total": 42,
"limit": 20,
"offset": 0
}
Importance Score Levels:
- 0.9: Medical/Safety (allergies, medications, emergencies)
- 0.85: Identity (name, personal identifiers)
- 0.75: Summaries (conversation summaries)
- 0.7: Preferences (likes, dislikes, wants)
- 0.65: Decisions (plans, choices)
- 0.5: Default (general information)
Delete Single Memory¶
DELETE /v1/memory/{memory_id}
Delete a specific memory by ID. Only the memory owner can delete it.
Auth Required: ✅ Yes
Path Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `memory_id` | UUID | ✅ | Memory identifier |
Request Example:
curl -X DELETE "https://api.traylinx.com/v1/memory/550e8400-e29b-41d4-a716-446655440000" \
-H "Authorization: Bearer ctx_your_token_here"
Response (200 OK):
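The response body is not shown in this reference; an illustrative shape might be:

```json
{
  "success": true,
  "message": "Memory deleted"
}
```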
Error Responses:
| Status | Code | Description |
|---|---|---|
| 404 | `MEMORY_NOT_FOUND` | Memory doesn't exist or not owned by user |
| 401 | `AUTHENTICATION_FAILED` | Invalid or missing token |
Delete All User Memories (GDPR)¶
DELETE /v1/memory/user/all
Delete all memories for the authenticated user. This endpoint supports GDPR "right to be forgotten" compliance.
Auth Required: ✅ Yes
Query Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `app_id` | string | ✅ | Application identifier |
Request Example:
curl -X DELETE "https://api.traylinx.com/v1/memory/user/all?app_id=my-app" \
-H "Authorization: Bearer ctx_your_token_here"
Response (200 OK):
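The response body is not shown in this reference; an illustrative shape might be:

```json
{
  "success": true,
  "deleted_count": 42
}
```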
⚠️ Warning: This action is irreversible. All memories for the user in the specified app will be permanently deleted.
Find Duplicate Memories¶
GET /v1/memory/duplicates
Find groups of semantically similar (duplicate) memories. Useful for identifying redundant information before cleanup.
Auth Required: ✅ Yes
Query Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `app_id` | string | ✅ | Application identifier |
Request Example:
curl -X GET "https://api.traylinx.com/v1/memory/duplicates?app_id=my-app" \
-H "Authorization: Bearer ctx_your_token_here"
Response (200 OK):
{
"groups": [
{
"canonical_id": "550e8400-e29b-41d4-a716-446655440000",
"canonical_content": "User's name is Sebastian",
"duplicates": [
{
"id": "660e8400-e29b-41d4-a716-446655440001",
"content": "User is referred to as Sebastian",
"created_at": "2025-12-02T14:30:00Z",
"similarity_to_canonical": 0.92
},
{
"id": "770e8400-e29b-41d4-a716-446655440002",
"content": "User is called Sebastian",
"created_at": "2025-12-03T10:00:00Z",
"similarity_to_canonical": 0.89
}
]
}
],
"total_duplicates": 2
}
Response Fields:
- groups: Array of duplicate groups, each containing a canonical (oldest) memory and its duplicates
- canonical_id: ID of the memory to keep (oldest in the group)
- canonical_content: Content of the canonical memory
- duplicates: Array of duplicate memories with similarity scores
- total_duplicates: Total count of duplicate memories across all groups
Merge Duplicate Memories¶
POST /v1/memory/deduplicate
Merge duplicate memory groups by keeping the oldest (canonical) memory and deleting all newer duplicates.
Auth Required: ✅ Yes
Query Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `app_id` | string | ✅ | Application identifier |
Request Example:
curl -X POST "https://api.traylinx.com/v1/memory/deduplicate?app_id=my-app" \
-H "Authorization: Bearer ctx_your_token_here"
Response (200 OK):
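The response body is not reproduced here. An illustrative example using the documented fields:

```json
{
  "success": true,
  "deleted_count": 2,
  "groups_merged": 1
}
```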
Response Fields:
- success: Whether the operation completed successfully
- deleted_count: Total number of duplicate memories deleted
- groups_merged: Number of duplicate groups that were merged
💡 Tip: Run `GET /v1/memory/duplicates` first to preview what will be merged before calling this endpoint.
🔄 Agent-to-Agent (A2A)¶
Create Conversation (A2A)¶
POST /a2a/conversation/create
Create a session using the A2A envelope protocol.
Request Body:
{
"envelope": {
"message_id": "msg-source-1",
"sender_agent_key": "agent-client",
"timestamp": "2025-12-02T12:00:00Z"
},
"user_id": "user-123",
"app_id": "my-app"
}
Response (201 Created):
{
"envelope": {
"message_id": "msg-resp-1",
"sender_agent_key": "traylinx-cortex",
"timestamp": "2025-12-02T12:00:01Z",
"in_reply_to": "msg-source-1"
},
"session_id": "...",
"created_at": "..."
}
Chat (A2A)¶
POST /a2a/conversation/chat
Send a message using the A2A envelope protocol.
Request Body:
{
"envelope": {
"message_id": "msg-source-2",
"sender_agent_key": "agent-client",
"timestamp": "2025-12-02T12:00:10Z"
},
"action": "chat",
"session_id": "...",
"message": "Hello",
"user_id": "user-123"
}
⚠️ Error Handling¶
All errors follow a standardized format:
{
"error": {
"code": "ERROR_CODE",
"message": "Human readable message",
"trace_id": "abc-123",
"details": {}
}
}
Common Error Codes¶
| Code | HTTP Status | Description |
|---|---|---|
| `SESSION_NOT_FOUND` | 404 | Session ID does not exist |
| `VALIDATION_ERROR` | 400 | Invalid request parameters |
| `AUTHENTICATION_FAILED` | 401 | Invalid or missing token |
| `TOKEN_BUDGET_EXCEEDED` | 400 | Request exceeds token limits |
| `LLM_SERVICE_ERROR` | 502 | Upstream LLM provider failed |
| `INTERNAL_ERROR` | 500 | Unexpected server error |
📋 Implementation Status¶
✅ Fully Implemented¶
| Endpoint | Method | Description |
|---|---|---|
| `/v1/users` | POST | User registration |
| `/v1/users/me` | GET | Get current user |
| `/v1/users/me/tokens` | POST | Create token |
| `/v1/users/me/tokens` | GET | List tokens |
| `/v1/users/me/tokens/{id}` | DELETE | Revoke token |
| `/v1/users/me/api-key` | PUT | Update Switch.AI API key |
| `/v1/users/me/sessions` | GET | List user sessions |
| `/v1/users/me/profile` | GET | Get user profile |
| `/v1/users/me/profile` | PATCH | Update profile |
| `/v1/users/me/profile` | DELETE | Clear profile |
| `/v1/session` | POST | Create session |
| `/v1/session/{id}` | GET | Get session |
| `/v1/session/{id}` | PATCH | Update session |
| `/v1/session/{id}` | DELETE | Delete session |
| `/v1/session/{id}/history` | GET | Get history |
| `/v1/chat` | POST | Chat completion |
| `/v1/memory/search` | POST | Semantic memory search |
| `/v1/memory/me` | GET | List user memories |
| `/v1/memory/{id}` | DELETE | Delete single memory |
| `/v1/memory/user/all` | DELETE | Delete all user memories (GDPR) |
| `/v1/memory/duplicates` | GET | Find duplicate memory groups |
| `/v1/memory/deduplicate` | POST | Merge duplicate memories |
| `/health` | GET | Basic health |
| `/health/live` | GET | Detailed health |
| `/ready` | GET | Readiness check |
🚧 Planned (Not Yet Implemented)¶
| Endpoint | Method | Description |
|---|---|---|
| `/v1/session/{id}/context` | GET | Get assembled context |
| `/v1/memory/consolidate` | POST | Force memory consolidation |
🔧 Configuration¶
Environment Variables¶
| Variable | Required | Default | Description |
|---|---|---|---|
| `DATABASE_URL` | ✅ | - | PostgreSQL connection URL |
| `REDIS_URL` | ✅ | - | Redis connection URL |
| `ENCRYPTION_KEY` | ✅ | - | Fernet key for data encryption |
| `LLM_BASE_URL` | ❌ | Switch.AI | LLM provider base URL |
| `EMBEDDING_BASE_URL` | ❌ | Switch.AI | Embedding provider URL |
| `SWITCH_AI_API_KEY` | ❌ | - | Default API key (fallback when no user key is set) |
| `MODEL_FAST` | ❌ | `openai/gemini-2.5-flash` | Fast model with provider prefix |
| `MODEL_BALANCED` | ❌ | `openai/llama-3.3-70b-versatile` | Balanced model with provider prefix |
| `MODEL_POWERFUL` | ❌ | `openai/deepseek-r1-distill-llama-70b` | Powerful model with provider prefix |
| `CELERY_BROKER_URL` | ❌ | `redis://redis:6379/1` | Celery message broker URL |
| `CELERY_RESULT_BACKEND` | ❌ | `redis://redis:6379/2` | Celery task results store URL |
| `MEMORY_TEMPORAL_DECAY` | ❌ | `0.05` | Memory decay rate per day (0.0-1.0) |
| `MEMORY_MIN_IMPORTANCE` | ❌ | `0.3` | Minimum importance score to store memory (0.0-1.0) |
| `MEMORY_MAX_AGE_DAYS` | ❌ | `365` | Maximum age in days for memories in search |
| `MEMORY_DEDUP_SIMILARITY_THRESHOLD` | ❌ | `0.85` | Minimum similarity to consider memories as duplicates (0.0-1.0) |
| `MEMORY_DEDUP_LLM_CHECK_ENABLED` | ❌ | `true` | Enable LLM-based semantic equivalence check for borderline cases |
| `MEMORY_DEDUP_LLM_THRESHOLD_LOW` | ❌ | `0.85` | Lower bound similarity for LLM equivalence check |
| `MEMORY_DEDUP_LLM_THRESHOLD_HIGH` | ❌ | `0.95` | Upper bound similarity (above this, skip LLM check) |
⚠️ LiteLLM Provider Prefixes: Model names must include provider prefixes when using LiteLLM routing. Common prefixes:
- `openai/` - For OpenAI-compatible endpoints (including SwitchAI)
- `anthropic/` - For Anthropic models
- `google/` - For Google Gemini models
- See LiteLLM docs for full list
Memory System Configuration¶
The memory system uses temporal decay and importance scoring to prioritize relevant information:
Temporal Decay (MEMORY_TEMPORAL_DECAY):
- Controls how quickly memories lose relevance over time
- Formula: weight = exp(-decay_rate × days_old)
- Default 0.05 means memories retain ~22% weight after 30 days
- Lower values = slower decay (memories stay relevant longer)
- Higher values = faster decay (prioritize recent information)
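The decay formula above can be verified in a couple of lines (a sketch of the documented formula, not the server's code):

```python
import math


def temporal_weight(days_old: float, decay_rate: float = 0.05) -> float:
    """Relevance weight of a memory after `days_old` days: exp(-decay_rate * days_old)."""
    return math.exp(-decay_rate * days_old)


# With the default rate of 0.05, a 30-day-old memory retains ~22% weight:
print(round(temporal_weight(30), 3))  # 0.223
```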
Minimum Importance (MEMORY_MIN_IMPORTANCE):
- Filters out low-importance memories during storage
- Memories below this threshold are not saved
- Default 0.3 filters trivial information while keeping preferences (0.7) and critical info (0.9)
Maximum Age (MEMORY_MAX_AGE_DAYS):
- Excludes memories older than this from search results
- Improves performance by reducing search space
- Default 365 days (1 year)
Memory Deduplication Configuration¶
The memory system automatically deduplicates semantically similar facts during extraction to prevent storing redundant information like "User's name is Sebastian" and "User is called Sebastian" as separate memories.
Similarity Threshold (MEMORY_DEDUP_SIMILARITY_THRESHOLD):
- Minimum cosine similarity to consider two memories as duplicates
- Default 0.85 catches most semantic duplicates
- Higher values = stricter matching (fewer duplicates detected)
- Lower values = looser matching (more aggressive deduplication)
LLM Check Enabled (MEMORY_DEDUP_LLM_CHECK_ENABLED):
- When true, uses LLM to verify semantic equivalence for borderline cases
- Adds accuracy but increases latency and cost
- Set to false to rely solely on embedding similarity
LLM Threshold Range (MEMORY_DEDUP_LLM_THRESHOLD_LOW / HIGH):
- Defines the "borderline" similarity range where LLM check is triggered
- Below LOW: Not a duplicate (store the memory)
- Between LOW and HIGH: Ask LLM to verify equivalence
- Above HIGH: Definite duplicate (skip without LLM check)
- Default range [0.85, 0.95] balances accuracy and cost
Deduplication Flow:
New Fact → Normalize → Generate Embedding → Search Similar
↓
similarity < 0.85 → STORE
similarity ≥ 0.95 → SKIP (duplicate)
0.85 ≤ similarity < 0.95 → LLM Check
↓
LLM says equivalent → SKIP
LLM says different → STORE
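The threshold logic in this flow reduces to a small decision function. This is an illustrative sketch; the defaults mirror `MEMORY_DEDUP_SIMILARITY_THRESHOLD` and `MEMORY_DEDUP_LLM_THRESHOLD_LOW`/`HIGH`:

```python
def dedup_decision(similarity: float, low: float = 0.85, high: float = 0.95) -> str:
    """Classify a new fact by its best cosine similarity to existing memories."""
    if similarity < low:
        return "STORE"      # clearly new information
    if similarity >= high:
        return "SKIP"       # definite duplicate, no LLM check needed
    return "LLM_CHECK"      # borderline: ask the LLM to verify equivalence


print(dedup_decision(0.60))  # STORE
print(dedup_decision(0.90))  # LLM_CHECK
print(dedup_decision(0.97))  # SKIP
```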