
# 🧠 Traylinx Cortex

**The Cognitive Core of the Traylinx Ecosystem**

A universal, plug-and-play "Brain Module" that provides unified memory, context management, and intelligent LLM routing for any application.

**Version:** 2.2.0 | **Tests:** 461 passing | **Coverage:** 75%

[![Production Ready](https://img.shields.io/badge/Status-Production%20Ready-brightgreen)](/) [![Tests](https://img.shields.io/badge/Tests-461%20passing-success)](/) [![Coverage](https://img.shields.io/badge/Coverage-75%25-blue)](/)

## Documentation Navigation

| Document | Description | Audience |
|---|---|---|
| 📋 Quick Reference | One-page developer cheat sheet | Developers |
| 📖 API Reference | Complete endpoint documentation | Developers |
| 🗄️ Database Schema | ERD diagrams and data flow | Developers/DBAs |
| 🔌 Integration Guide | Connect via Sentinel & A2A | Integrators |
| 🏗️ Implementation Plan | Step-by-step build guide | Contributors |

## What is Traylinx Cortex?

Traylinx Cortex is the intelligent middleware between your user interface and raw LLM APIs. It manages the entire lifecycle of a conversation, ensuring that your AI agents are not just "chatbots" but context-aware partners with persistent memory.

### Core Capabilities

| Feature | Description |
|---|---|
| 🧩 Unified Memory | Seamlessly integrates Short-Term (Redis) and Long-Term (PostgreSQL + pgvector) memory |
| 🔄 LLM Agnostic | Dynamic routing via LiteLLM: configurable base URLs plus per-user API keys for any provider |
| 🔌 Drop-in Integration | RESTful API for any Web, Mobile, or IoT application |
| 🔒 Enterprise Security | Built-in PII scrubbing, multi-tenancy, and Traylinx Sentinel authentication |
| 📊 Full Observability | LangSmith/OpenTelemetry tracing, structured logging, cost tracking |

## Architecture

```mermaid
graph TB
    subgraph "Client Layer"
        WEB[Web App]
        MOB[Mobile App]
        A2A[A2A Agents]
    end
    subgraph "API Layer"
        GW[FastAPI Gateway]
        AUTH[Sentinel Auth]
    end
    subgraph "Core Engine"
        ORCH[LangGraph Orchestrator]
        LLM[LiteLLM Router]
        PII[PII Scrubber]
    end
    subgraph "Memory Layer"
        STM[Redis STM]
        LTM[PostgreSQL + pgvector LTM]
        EMB[Embeddings Service]
    end
    subgraph "Background"
        CELERY[Celery Workers]
        CONSOL[Memory Consolidation]
    end
    WEB --> GW
    MOB --> GW
    A2A --> GW
    GW --> AUTH
    AUTH --> ORCH
    ORCH --> PII
    ORCH --> STM
    ORCH --> LTM
    ORCH --> LLM
    LTM --> EMB
    CELERY --> CONSOL
    CONSOL --> LTM
```

Technology Stack:

| Layer | Technology | Purpose |
|---|---|---|
| API | FastAPI 0.115+ | High-performance async REST API |
| Orchestration | LangGraph 0.2+ | State machine for conversation flow |
| LLM Routing | LiteLLM 1.50+ | Multi-provider abstraction (OpenAI, Anthropic, SwitchAI) |
| STM | Redis 7 (or Valkey) | Fast session cache |
| LTM | PostgreSQL 16 + pgvector | Persistent memory with vectors |
| Background | Celery 5.4+ | Async task processing |
| PII | Microsoft Presidio | Privacy protection |
| Auth | Traylinx Sentinel | A2A authentication |

## Quick Start

### Prerequisites

- Python 3.11+
- Poetry
- Docker & Docker Compose

### Installation

```bash
# Clone and navigate
cd traylinx_cortex

# Install dependencies
poetry install

# Setup environment
cp .env.example .env
# Edit .env with your credentials
# IMPORTANT: Model names must include provider prefixes (e.g., openai/...)

# Start infrastructure
docker-compose up -d

# Run migrations
poetry run alembic upgrade head

# Start service
poetry run uvicorn app.main:app --reload
```

### First Request

```bash
# Create a session
curl -X POST http://localhost:8000/v1/session \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user_123", "app_id": "my_app"}'

# Response: {"session_id": "abc-123-def", "created_at": "..."}

# Chat with memory
curl -X POST http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "abc-123-def",
    "message": "Hello, my name is Sebastian and I love pizza",
    "user_id": "user_123"
  }'

# Stream responses (SSE)
curl -N http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "abc-123-def",
    "message": "What is my name?",
    "user_id": "user_123",
    "config": {"stream": true}
  }'

# Use a per-user API key and embedding model
curl -X POST http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "abc-123-def",
    "message": "Hello!",
    "user_id": "user_123",
    "config": {
      "switch_ai_api_key": "sk-user-specific-key",
      "embedding_model": "mistral-embed"
    }
  }'
```

📖 More examples: See API Reference for complete endpoint documentation.


## API Reference

### Sessions

| Endpoint | Method | Description |
|---|---|---|
| `/v1/session` | POST | Create new conversation session |
| `/v1/session/{id}` | GET | Get session details |
| `/v1/session/{id}` | DELETE | End session |
| `/v1/session/{id}/messages` | GET | List session messages |

### Chat

| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat` | POST | Send message and get response |
| `/v1/chat` | POST (`stream: true`) | Stream response via SSE |

### A2A (Agent-to-Agent)

| Endpoint | Method | Description |
|---|---|---|
| `/a2a/conversation` | POST | A2A envelope protocol |

### Memory

| Endpoint | Method | Description |
|---|---|---|
| `/v1/memory/search` | POST | Semantic memory search |
| `/v1/memory/me` | GET | List user memories |
| `/v1/memory/{id}` | DELETE | Delete single memory |
| `/v1/memory/duplicates` | GET | Find duplicate memory groups |
| `/v1/memory/deduplicate` | POST | Merge duplicate memories |
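
As a sketch, a semantic search against a user's long-term memory might look like the call below. The payload field names (`query`, `limit`) are illustrative assumptions, not the confirmed schema; check the OpenAPI docs at `/docs` for the exact request shape.

```bash
# Hypothetical payload -- field names are assumptions; see /docs for the real schema
curl -X POST http://localhost:8000/v1/memory/search \
  -H "Authorization: Bearer <user_token>" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user_123", "query": "favorite food", "limit": 5}'
```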

### Health

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Basic health check |
| `/health/ready` | GET | Readiness probe |
| `/health/live` | GET | Liveness probe |

📖 Full documentation: OpenAPI docs available at `/docs` when running.


## Authentication

Cortex supports two authentication methods:

### Option 1: User Authentication (Human API)

For human users accessing via UI or apps:

```bash
curl -X POST "http://localhost:8000/v1/chat" \
  -H "Authorization: Bearer <user_token>" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "abc-123", "message": "Hello"}'
```

| Header | Value | Description |
|---|---|---|
| `Authorization` | `Bearer <token>` | User token validated against the database |

### Option 2: Agent Authentication (A2A / Machine-to-Machine)

For other agents or services calling Cortex via Traylinx Sentinel:

```bash
curl -X POST "http://localhost:8000/v1/chat" \
  -H "X-Agent-Secret-Token: <agent_token>" \
  -H "X-Agent-User-Id: <agent_id>" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "abc-123", "message": "Hello"}'
```

| Header | Value | Description |
|---|---|---|
| `X-Agent-Secret-Token` | Agent's secret token | Obtained from Traylinx Sentinel |
| `X-Agent-User-Id` | Agent's UUID | Agent identifier registered with Sentinel |

### Authentication Flow

```
Request arrives
    │
    ├─ Has X-Agent-Secret-Token? ──→ AGENT MODE (validate via Sentinel)
    │
    └─ Has Authorization: Bearer? ──→ USER MODE (validate via Database)
```

Note: Agent authentication headers take precedence over Bearer token if both are provided.

### Using Dual-Auth in Code

Endpoints can use get_caller_identity() to support both auth modes:

```python
from fastapi import APIRouter, Depends, Request
from sqlalchemy.ext.asyncio import AsyncSession

from app.core.database import get_db  # assumes get_db lives in app/core/database.py
from app.core.security import get_caller_identity

router = APIRouter()


@router.post("/my-endpoint")
async def my_endpoint(request: Request, db: AsyncSession = Depends(get_db)):
    caller = await get_caller_identity(request, db)

    if caller.is_agent:
        # Called by another agent
        agent_id = caller.id
        return {"caller_type": "agent", "caller_id": agent_id}
    else:
        # Called by a human user
        user = caller.user  # Full User model
        return {"caller_type": "user", "caller_id": str(user.id)}
```

## 🧪 Development

### Running Tests

```bash
# All tests
poetry run pytest tests/ -v

# With coverage
poetry run pytest tests/ -v --cov=app --cov-report=term

# Unit tests only
poetry run pytest tests/unit/ -v

# Integration tests only
poetry run pytest tests/integration/ -v
```

### Code Quality

```bash
# Linting
poetry run ruff check app/

# Auto-fix
poetry run ruff check app/ --fix

# Formatting
poetry run ruff format app/

# Type checking
poetry run mypy app/ --ignore-missing-imports
```

### Test Coverage Summary

| Category | Tests | Scope / Coverage |
|---|---|---|
| Unit Tests | 398 | Core services, models, utilities |
| Integration Tests | 63 | End-to-end flows |
| **Total** | **461** | **75%** |

## Configuration

Environment variables (see .env.example):

| Variable | Description | Required |
|---|---|---|
| `DATABASE_URL` | PostgreSQL connection string | Yes |
| `REDIS_URL` | Redis connection string | Yes |
| `LLM_BASE_URL` | Base URL for LLM calls (default: Makakoo SwitchAI) | No |
| `EMBEDDING_BASE_URL` | Base URL for embeddings (default: Makakoo SwitchAI) | No |
| `EMBEDDING_MODEL` | Embedding model (default: `mistral-embed`) | No |
| `EMBEDDING_DIMENSIONS` | Vector dimensions; set before the first migration! (default: 1024) | No |
| `OPENAI_API_KEY` | OpenAI API key (optional fallback) | No |
| `ANTHROPIC_API_KEY` | Anthropic API key (optional fallback) | No |
| `SENTINEL_URL` | Traylinx Sentinel URL | No |
| `STM_TOKEN_BUDGET` | Short-term memory token budget (default: 4000) | No |
| `LTM_TOKEN_BUDGET` | Long-term memory token budget (default: 1000) | No |
| `MODEL_FAST` | Fast model with provider prefix (e.g., `openai/gemini-2.5-flash`) | No |
| `MODEL_BALANCED` | Balanced model with provider prefix (e.g., `openai/llama-3.3-70b-versatile`) | No |
| `MODEL_POWERFUL` | Powerful model with provider prefix (e.g., `openai/deepseek-r1-distill-llama-70b`) | No |
| `CELERY_RESULT_BACKEND` | Redis URL for Celery task results (default: `redis://redis:6379/2`) | No |

⚠️ Provider Prefixes: When using LiteLLM with a proxy (like SwitchAI), model names must include provider prefixes (e.g., openai/, anthropic/). See .env.example for details.
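
As a minimal sketch, a `.env` excerpt with correctly prefixed model names might look like this; the values simply mirror the defaults and examples documented above:

```bash
# Illustrative .env excerpt -- see .env.example for the full list
LLM_BASE_URL=https://your-llm-proxy/v1
MODEL_FAST=openai/gemini-2.5-flash              # provider prefix required
MODEL_BALANCED=openai/llama-3.3-70b-versatile   # provider prefix required
EMBEDDING_MODEL=mistral-embed
EMBEDDING_DIMENSIONS=1024                       # set this before the first migration
```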

📖 Full configuration guide: See Integration Guide


## 🐳 Docker Deployment

### Docker Compose (Development)

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f cortex-api

# Stop
docker-compose down
```

### Services

| Service | Port | Description |
|---|---|---|
| cortex-api | 8000 | Main API server |
| postgres | 5432 | PostgreSQL with pgvector |
| redis | 6379 | Redis for STM |
| celery-worker | - | Background task processor |

### Production Deployment

```bash
# Build production image
docker build -t traylinx-cortex:latest .

# Run with production settings
docker run -d \
  -p 8000:8000 \
  -e DATABASE_URL=... \
  -e REDIS_URL=... \
  -e LLM_BASE_URL=https://your-llm-proxy/v1 \
  -e EMBEDDING_BASE_URL=https://your-embedding-proxy/v1 \
  -e EMBEDDING_MODEL=mistral-embed \
  traylinx-cortex:latest
```

## Monitoring

### Structured Logging

All logs are JSON-formatted with trace IDs:

```json
{
  "timestamp": "2024-11-27T10:00:00Z",
  "level": "INFO",
  "service": "traylinx-cortex",
  "trace_id": "abc-123",
  "event": "llm_call",
  "context": {
    "model": "gpt-4o",
    "tokens_in": 100,
    "tokens_out": 50,
    "latency_ms": 1234.5,
    "cost_usd": 0.002
  }
}
```
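
Because each log line is a single JSON object carrying a `trace_id`, you can follow one request through the logs with ordinary shell tools; a minimal sketch:

```bash
# Filter the API logs down to a single request by its trace ID
docker-compose logs -f cortex-api | grep '"trace_id": "abc-123"'
```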

### Health Endpoints

| Endpoint | Purpose |
|---|---|
| `/health` | Basic alive check |
| `/health/ready` | Full readiness (DB, Redis, dependencies) |
| `/health/live` | Kubernetes liveness probe |
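
A quick smoke test of all three endpoints from the host, assuming the default port mapping:

```bash
# curl -f exits non-zero on HTTP errors, which is handy for scripts and healthchecks
curl -f http://localhost:8000/health
curl -f http://localhost:8000/health/ready
curl -f http://localhost:8000/health/live
```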

πŸ“ Project StructureΒΆ

```
traylinx_cortex/
├── app/                          # Application code
│   ├── api/                      # REST & A2A endpoints
│   │   ├── v1/                   # Version 1 API
│   │   │   ├── chat.py           # Chat endpoints
│   │   │   ├── session.py        # Session management
│   │   │   └── health.py         # Health checks
│   │   └── a2a/                  # Agent-to-Agent
│   │       └── conversation.py   # A2A protocol
│   ├── core/                     # Core utilities
│   │   ├── config.py             # Settings management
│   │   ├── database.py           # DB connection
│   │   ├── errors.py             # Custom exceptions
│   │   ├── logging.py            # Structured logging
│   │   └── security.py           # Authentication
│   ├── models/                   # Data models
│   │   ├── api.py                # Pydantic schemas
│   │   └── db.py                 # SQLAlchemy models
│   ├── services/                 # Business logic
│   │   ├── embeddings.py         # Vector embeddings
│   │   ├── llm.py                # LLM routing
│   │   ├── memory.py             # STM & LTM managers
│   │   ├── orchestrator.py       # LangGraph state machine
│   │   ├── pii_scrubber.py       # Privacy protection
│   │   └── token_counter.py      # Token estimation
│   ├── workers/                  # Background tasks
│   │   ├── celery_app.py         # Celery config
│   │   └── tasks.py              # Consolidation tasks
│   └── main.py                   # FastAPI app
├── docs/                         # Documentation
│   ├── QUICK_REFERENCE.md        # Developer cheat sheet
│   ├── api_reference.md          # API documentation
│   ├── DATABASE_SCHEMA.md        # ERD diagrams
│   └── integration_guide.md      # Integration guide
├── migrations/                   # Alembic migrations
├── tests/                        # Test suite
│   ├── unit/                     # Unit tests (398)
│   └── integration/              # Integration tests (63)
├── Dockerfile                    # Production build
├── docker-compose.yml            # Full stack
├── pyproject.toml                # Dependencies
└── .env.example                  # Config template
```

## 🤝 Contributing

1. Read the Quick Reference for an overview
2. Follow the Implementation Plan
3. All code must pass `ruff check` and `ruff format`
4. Maintain a minimum of 75% test coverage for new code
5. Write property-based tests for critical logic
6. Use conventional commits (see the example below)
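
For example, conventional-commit messages look like this (the scopes and summaries here are illustrative, not real history):

```bash
git commit -m "feat(memory): add duplicate-merge endpoint"
git commit -m "fix(auth): reject expired Sentinel agent tokens"
```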

πŸ“ LicenseΒΆ

© 2025 Traylinx. All rights reserved.


**[📋 Quick Reference](./cortex/QUICK_REFERENCE.md)** • **[📖 API Docs](./cortex/api_reference.md)** • **[🗄️ Database Schema](./cortex/DATABASE_SCHEMA.md)**

Production Ready • Contact: traylinx.com