# 🧠 Traylinx Cortex
**The Cognitive Core of the Traylinx Ecosystem.** A universal, plug-and-play "Brain Module" that provides unified memory, context management, and intelligent LLM routing for any application.

**Version:** 2.2.0 | **Tests:** 461 passing | **Coverage:** 75%

## Documentation Navigation
| Document | Description | Audience |
|---|---|---|
| Quick Reference | One-page developer cheat sheet | Developers |
| API Reference | Complete endpoint documentation | Developers |
| Database Schema | ERD diagrams and data flow | Developers/DBAs |
| Integration Guide | Connect via Sentinel & A2A | Integrators |
| Implementation Plan | Step-by-step build guide | Contributors |
## What is Traylinx Cortex?
Traylinx Cortex is the intelligent middleware between your user interface and raw LLM APIs. It manages the entire lifecycle of a conversation, ensuring that your AI agents are not just "chatbots" but context-aware partners with persistent memory.
## Core Capabilities
| Feature | Description |
|---|---|
| Unified Memory | Seamlessly integrates Short-Term (Redis) and Long-Term (PostgreSQL + pgvector) memory |
| LLM Agnostic | Dynamic routing via LiteLLM: configurable base URLs + per-user API keys for any provider |
| Drop-in Integration | RESTful API for any Web, Mobile, or IoT application |
| Enterprise Security | Built-in PII scrubbing, multi-tenancy, and Traylinx Sentinel authentication |
| Full Observability | LangSmith/OpenTelemetry tracing, structured logging, cost tracking |
## Architecture
Technology Stack:
| Layer | Technology | Purpose |
|---|---|---|
| API | FastAPI 0.115+ | High-performance async REST API |
| Orchestration | LangGraph 0.2+ | State machine for conversation flow |
| LLM Routing | LiteLLM 1.50+ | Multi-provider abstraction (OpenAI, Anthropic, SwitchAI) |
| STM | Redis 7 (or Valkey) | Fast session cache |
| LTM | PostgreSQL 16 + pgvector | Persistent memory with vectors |
| Background | Celery 5.4+ | Async task processing |
| PII | Microsoft Presidio | Privacy protection |
| Auth | Traylinx Sentinel | A2A authentication |
## Quick Start

### Prerequisites
- Python 3.11+
- Poetry
- Docker & Docker Compose
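To confirm the prerequisites are in place before installing (output will vary by environment):

```bash
# Verify required tooling is on PATH
python3 --version        # expect 3.11 or newer
poetry --version
docker --version
docker-compose --version
```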
### Installation

```bash
# Clone and navigate
cd traylinx_cortex

# Install dependencies
poetry install

# Set up environment
cp .env.example .env
# Edit .env with your credentials
# IMPORTANT: Model names must include provider prefixes (e.g., openai/...)

# Start infrastructure
docker-compose up -d

# Run migrations
poetry run alembic upgrade head

# Start service
poetry run uvicorn app.main:app --reload
```
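With the stack up, a quick request against the documented `/health` endpoint confirms the service is alive (assuming the default port 8000):

```bash
# Sanity-check the running service
curl http://localhost:8000/health
```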
### First Request

```bash
# Create a session
curl -X POST http://localhost:8000/v1/session \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user_123", "app_id": "my_app"}'
# Response: {"session_id": "abc-123-def", "created_at": "..."}

# Chat with memory
curl -X POST http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "abc-123-def",
    "message": "Hello, my name is Sebastian and I love pizza",
    "user_id": "user_123"
  }'

# Stream responses (SSE)
curl -N http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "abc-123-def",
    "message": "What is my name?",
    "user_id": "user_123",
    "config": {"stream": true}
  }'

# Use per-user API key and embedding model
curl -X POST http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "abc-123-def",
    "message": "Hello!",
    "user_id": "user_123",
    "config": {
      "switch_ai_api_key": "sk-user-specific-key",
      "embedding_model": "mistral-embed"
    }
  }'
```
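To verify what the session has recorded so far, you can list its messages via the session endpoint documented below (same example session ID):

```bash
# Inspect the stored conversation history for the example session
curl http://localhost:8000/v1/session/abc-123-def/messages
```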
**More examples:** See the API Reference for complete endpoint documentation.
## API Reference

### Sessions

| Endpoint | Method | Description |
|---|---|---|
| `/v1/session` | POST | Create new conversation session |
| `/v1/session/{id}` | GET | Get session details |
| `/v1/session/{id}` | DELETE | End session |
| `/v1/session/{id}/messages` | GET | List session messages |
### Chat

| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat` | POST | Send message and get response |
| `/v1/chat` | POST (`"stream": true`) | Stream response via SSE |
### A2A (Agent-to-Agent)

| Endpoint | Method | Description |
|---|---|---|
| `/a2a/conversation` | POST | A2A envelope protocol |
### Memory

| Endpoint | Method | Description |
|---|---|---|
| `/v1/memory/search` | POST | Semantic memory search |
| `/v1/memory/me` | GET | List user memories |
| `/v1/memory/{id}` | DELETE | Delete single memory |
| `/v1/memory/duplicates` | GET | Find duplicate memory groups |
| `/v1/memory/deduplicate` | POST | Merge duplicate memories |
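For example, a semantic search call might look like the sketch below; the request body fields (`query`, `user_id`, `top_k`) are assumptions for illustration, not the documented schema, so consult the API Reference for the authoritative contract:

```bash
# Hypothetical payload shape for semantic memory search
curl -X POST http://localhost:8000/v1/memory/search \
  -H "Content-Type: application/json" \
  -d '{"query": "favorite food", "user_id": "user_123", "top_k": 5}'
```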
### Health

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Basic health check |
| `/health/ready` | GET | Readiness probe |
| `/health/live` | GET | Liveness probe |
**Full documentation:** OpenAPI docs are available at `/docs` when the service is running.
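The machine-readable spec can also be fetched directly (FastAPI's default schema path, assuming a local instance):

```bash
# Download the OpenAPI schema for client generation or diffing
curl http://localhost:8000/openapi.json -o openapi.json
```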
## Authentication

Cortex supports two authentication methods.

### Option 1: User Authentication (Human API)

For human users accessing via a UI or apps:

```bash
curl -X POST "http://localhost:8000/v1/chat" \
  -H "Authorization: Bearer <user_token>" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "abc-123", "message": "Hello"}'
```

| Header | Value | Description |
|---|---|---|
| `Authorization` | `Bearer <token>` | User token validated against the database |
### Option 2: Agent Authentication (A2A / Machine-to-Machine)

For other agents or services calling Cortex via Traylinx Sentinel:

```bash
curl -X POST "http://localhost:8000/v1/chat" \
  -H "X-Agent-Secret-Token: <agent_token>" \
  -H "X-Agent-User-Id: <agent_id>" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "abc-123", "message": "Hello"}'
```

| Header | Value | Description |
|---|---|---|
| `X-Agent-Secret-Token` | Agent's secret token | Obtained from Traylinx Sentinel |
| `X-Agent-User-Id` | Agent's UUID | Agent identifier registered with Sentinel |
### Authentication Flow

```
Request arrives
 │
 ├── Has X-Agent-Secret-Token? ──▶ AGENT MODE (validate via Sentinel)
 │
 └── Has Authorization: Bearer? ──▶ USER MODE (validate via Database)
```
**Note:** Agent authentication headers take precedence over the Bearer token if both are provided.
### Using Dual-Auth in Code

Endpoints can use `get_caller_identity()` to support both auth modes:

```python
from fastapi import Depends, Request
from sqlalchemy.ext.asyncio import AsyncSession

from app.core.database import get_db  # assumed import path per the project layout
from app.core.security import get_caller_identity

# `router` is an APIRouter defined elsewhere in the module

@router.post("/my-endpoint")
async def my_endpoint(request: Request, db: AsyncSession = Depends(get_db)):
    caller = await get_caller_identity(request, db)
    if caller.is_agent:
        # Called by another agent
        agent_id = caller.id
    else:
        # Called by a human user
        user = caller.user  # Full User model
```
## 🧪 Development

### Running Tests

```bash
# All tests
poetry run pytest tests/ -v

# With coverage
poetry run pytest tests/ -v --cov=app --cov-report=term

# Unit tests only
poetry run pytest tests/unit/ -v

# Integration tests only
poetry run pytest tests/integration/ -v
```
### Code Quality

```bash
# Linting
poetry run ruff check app/

# Auto-fix
poetry run ruff check app/ --fix

# Formatting
poetry run ruff format app/

# Type checking
poetry run mypy app/ --ignore-missing-imports
```
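To run every gate in one pass locally, the commands above can be chained; `ruff format --check` verifies formatting without rewriting files:

```bash
# Fail fast if any quality gate fails
poetry run ruff check app/ \
  && poetry run ruff format --check app/ \
  && poetry run mypy app/ --ignore-missing-imports
```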
### Test Coverage Summary

| Category | Tests | Scope |
|---|---|---|
| Unit tests | 398 | Core services, models, utilities |
| Integration tests | 63 | End-to-end flows |
| Total | 461 | 75% line coverage |
## Configuration

Environment variables (see `.env.example`):

| Variable | Description | Required |
|---|---|---|
| `DATABASE_URL` | PostgreSQL connection string | Yes |
| `REDIS_URL` | Redis connection string | Yes |
| `LLM_BASE_URL` | Base URL for LLM calls (default: Makakoo SwitchAI) | No |
| `EMBEDDING_BASE_URL` | Base URL for embeddings (default: Makakoo SwitchAI) | No |
| `EMBEDDING_MODEL` | Embedding model (default: `mistral-embed`) | No |
| `EMBEDDING_DIMENSIONS` | Vector dimensions; set before the first migration! (default: 1024) | No |
| `OPENAI_API_KEY` | OpenAI API key (optional fallback) | No |
| `ANTHROPIC_API_KEY` | Anthropic API key (optional fallback) | No |
| `SENTINEL_URL` | Traylinx Sentinel URL | No |
| `STM_TOKEN_BUDGET` | Short-term memory budget (default: 4000) | No |
| `LTM_TOKEN_BUDGET` | Long-term memory budget (default: 1000) | No |
| `MODEL_FAST` | Fast model with provider prefix (e.g., `openai/gemini-2.5-flash`) | No |
| `MODEL_BALANCED` | Balanced model with provider prefix (e.g., `openai/llama-3.3-70b-versatile`) | No |
| `MODEL_POWERFUL` | Powerful model with provider prefix (e.g., `openai/deepseek-r1-distill-llama-70b`) | No |
| `CELERY_RESULT_BACKEND` | Redis URL for Celery task results (default: `redis://redis:6379/2`) | No |

⚠️ **Provider prefixes:** When using LiteLLM with a proxy (like SwitchAI), model names must include provider prefixes (e.g., `openai/`, `anthropic/`). See `.env.example` for details.

**Full configuration guide:** See the Integration Guide.
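As an illustrative starting point, a minimal `.env` could look like the sketch below; only the two required variables are set, and the connection-string values are placeholders, not project defaults:

```bash
# Minimal .env sketch (placeholder values)
DATABASE_URL=postgresql://cortex:secret@localhost:5432/cortex
REDIS_URL=redis://localhost:6379/0

# Optional: models routed through a proxy need provider prefixes
MODEL_FAST=openai/gemini-2.5-flash
EMBEDDING_MODEL=mistral-embed
```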
## 🐳 Docker Deployment

### Docker Compose (Development)

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f cortex-api

# Stop
docker-compose down
```
### Services

| Service | Port | Description |
|---|---|---|
| `cortex-api` | 8000 | Main API server |
| `postgres` | 5432 | PostgreSQL with pgvector |
| `redis` | 6379 | Redis for STM |
| `celery-worker` | - | Background task processor |
### Production Deployment

```bash
# Build production image
docker build -t traylinx-cortex:latest .

# Run with production settings
docker run -d \
  -p 8000:8000 \
  -e DATABASE_URL=... \
  -e REDIS_URL=... \
  -e LLM_BASE_URL=https://your-llm-proxy/v1 \
  -e EMBEDDING_BASE_URL=https://your-embedding-proxy/v1 \
  -e EMBEDDING_MODEL=mistral-embed \
  traylinx-cortex:latest
```
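For longer variable lists, the same image can read configuration from a file instead of repeated `-e` flags (the file name here is arbitrary):

```bash
# Run the production image with configuration from an env file
docker run -d -p 8000:8000 --env-file .env.production traylinx-cortex:latest
```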
## Monitoring

### Structured Logging

All logs are JSON-formatted with trace IDs:

```json
{
  "timestamp": "2024-11-27T10:00:00Z",
  "level": "INFO",
  "service": "traylinx-cortex",
  "trace_id": "abc-123",
  "event": "llm_call",
  "context": {
    "model": "gpt-4o",
    "tokens_in": 100,
    "tokens_out": 50,
    "latency_ms": 1234.5,
    "cost_usd": 0.002
  }
}
```
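Because each log line is a single JSON object, standard tools can slice the stream; for example, filtering `llm_call` events for cost (a sketch that assumes the raw JSON lines are on stdout, e.g. when running `uvicorn` directly):

```bash
# Surface model and cost for each llm_call event; non-JSON lines are skipped
poetry run uvicorn app.main:app 2>&1 \
  | jq -rR 'fromjson? | select(.event == "llm_call") | "\(.context.model)\t\(.context.cost_usd)"'
```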
### Health Endpoints

| Endpoint | Purpose |
|---|---|
| `/health` | Basic alive check |
| `/health/ready` | Full readiness (DB, Redis, dependencies) |
| `/health/live` | Kubernetes liveness probe |
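In deploy scripts, `/health/ready` can gate traffic until dependencies are up (a sketch, assuming the default port):

```bash
# Block until the service reports full readiness
until curl -fsS http://localhost:8000/health/ready > /dev/null; do
  echo "waiting for cortex-api..."
  sleep 2
done
```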
## Project Structure

```
traylinx_cortex/
├── app/                       # Application code
│   ├── api/                   # REST & A2A endpoints
│   │   ├── v1/                # Version 1 API
│   │   │   ├── chat.py        # Chat endpoints
│   │   │   ├── session.py     # Session management
│   │   │   └── health.py      # Health checks
│   │   └── a2a/               # Agent-to-Agent
│   │       └── conversation.py  # A2A protocol
│   ├── core/                  # Core utilities
│   │   ├── config.py          # Settings management
│   │   ├── database.py        # DB connection
│   │   ├── errors.py          # Custom exceptions
│   │   ├── logging.py         # Structured logging
│   │   └── security.py        # Authentication
│   ├── models/                # Data models
│   │   ├── api.py             # Pydantic schemas
│   │   └── db.py              # SQLAlchemy models
│   ├── services/              # Business logic
│   │   ├── embeddings.py      # Vector embeddings
│   │   ├── llm.py             # LLM routing
│   │   ├── memory.py          # STM & LTM managers
│   │   ├── orchestrator.py    # LangGraph state machine
│   │   ├── pii_scrubber.py    # Privacy protection
│   │   └── token_counter.py   # Token estimation
│   ├── workers/               # Background tasks
│   │   ├── celery_app.py      # Celery config
│   │   └── tasks.py           # Consolidation tasks
│   └── main.py                # FastAPI app
├── docs/                      # Documentation
│   ├── QUICK_REFERENCE.md     # Developer cheat sheet
│   ├── api_reference.md       # API documentation
│   ├── DATABASE_SCHEMA.md     # ERD diagrams
│   └── integration_guide.md   # Integration guide
├── migrations/                # Alembic migrations
├── tests/                     # Test suite
│   ├── unit/                  # Unit tests (398)
│   └── integration/           # Integration tests (63)
├── Dockerfile                 # Production build
├── docker-compose.yml         # Full stack
├── pyproject.toml             # Dependencies
└── .env.example               # Config template
```
## Contributing

- Read the Quick Reference for an overview
- Follow the Implementation Plan
- All code must pass `ruff check` and `ruff format`
- Maintain a minimum of 75% test coverage for new code
- Write property-based tests for critical logic
- Use conventional commits (see the example below)
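For reference, conventional commits follow a `type(scope): summary` pattern; the scope and summary below are illustrative only:

```bash
# Example conventional commit message
git commit -m "feat(memory): add semantic search filters"
```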
## License

© 2025 Traylinx. All rights reserved.

Production Ready • Contact: traylinx.com