# 🧠 Traylinx Cortex
**The Cognitive Core of the Traylinx Ecosystem.** A universal, plug-and-play "Brain Module" that provides unified memory, context management, and intelligent LLM routing for any application.

**Version:** 2.2.0 | **Tests:** 461 passing | **Coverage:** 75%

## Documentation Navigation
| Document | Description | Audience |
|---|---|---|
| Quick Reference | One-page developer cheat sheet | Developers |
| API Reference | Complete endpoint documentation | Developers |
| Database Schema | ERD diagrams and data flow | Developers/DBAs |
| Integration Guide | Connect via Sentinel & A2A | Integrators |
| Implementation Plan | Step-by-step build guide | Contributors |
## What is Traylinx Cortex?
Traylinx Cortex is the intelligent middleware between your user interface and raw LLM APIs. It manages the entire lifecycle of a conversation, ensuring that your AI agents are not just "chatbots" but context-aware partners with persistent memory.
## Core Capabilities
| Feature | Description |
|---|---|
| Unified Memory | Seamlessly integrates Short-Term (Redis) and Long-Term (PostgreSQL + pgvector) memory |
| LLM Agnostic | Dynamic routing via LiteLLM: configurable base URLs + per-user API keys for any provider |
| Drop-in Integration | RESTful API for any Web, Mobile, or IoT application |
| Enterprise Security | Built-in PII scrubbing, multi-tenancy, and Traylinx Sentinel authentication |
| Full Observability | LangSmith/OpenTelemetry tracing, structured logging, cost tracking |
## Architecture
Technology Stack:
| Layer | Technology | Purpose |
|---|---|---|
| API | FastAPI 0.115+ | High-performance async REST API |
| Orchestration | LangGraph 0.2+ | State machine for conversation flow |
| LLM Routing | LiteLLM 1.50+ | Multi-provider abstraction (OpenAI, Anthropic, SwitchAI) |
| STM | Redis 7 (or Valkey) | Fast session cache |
| LTM | PostgreSQL 16 + pgvector | Persistent memory with vectors |
| Background | Celery 5.4+ | Async task processing |
| PII | Microsoft Presidio | Privacy protection |
| Auth | Traylinx Sentinel | A2A authentication |
## Quick Start

### Prerequisites
- Python 3.11+
- Poetry
- Docker & Docker Compose
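To confirm the prerequisites are in place before installing (output will vary by environment):

```bash
# Verify required tooling is on PATH
python3 --version        # expect 3.11 or newer
poetry --version
docker --version
docker-compose --version
```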
### Installation

```bash
# Clone and navigate
cd traylinx_cortex

# Install dependencies
poetry install

# Set up environment
cp .env.example .env
# Edit .env with your credentials
# IMPORTANT: Model names must include provider prefixes (e.g., openai/...)

# Start infrastructure
docker-compose up -d

# Run migrations
poetry run alembic upgrade head

# Start service
poetry run uvicorn app.main:app --reload
```
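With the stack up, a quick request against the documented `/health` endpoint confirms the service is alive (assuming the default port 8000):

```bash
# Sanity-check the running service
curl http://localhost:8000/health
```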
### First Request

```bash
# Create a session
curl -X POST http://localhost:8000/v1/session \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user_123", "app_id": "my_app"}'
# Response: {"session_id": "abc-123-def", "created_at": "..."}

# Chat with memory
curl -X POST http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "abc-123-def",
    "message": "Hello, my name is Sebastian and I love pizza",
    "user_id": "user_123"
  }'

# Stream responses (SSE)
curl -N http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "abc-123-def",
    "message": "What is my name?",
    "user_id": "user_123",
    "config": {"stream": true}
  }'

# Use per-user API key and embedding model
curl -X POST http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "abc-123-def",
    "message": "Hello!",
    "user_id": "user_123",
    "config": {
      "switch_ai_api_key": "sk-user-specific-key",
      "embedding_model": "mistral-embed"
    }
  }'
```
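To verify what the session has recorded so far, you can list its messages via the session endpoint documented below (same example session ID):

```bash
# Inspect the stored conversation history for the example session
curl http://localhost:8000/v1/session/abc-123-def/messages
```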
**More examples:** See the API Reference for complete endpoint documentation.
## API Reference

### Sessions

| Endpoint | Method | Description |
|---|---|---|
| `/v1/session` | POST | Create new conversation session |
| `/v1/session/{id}` | GET | Get session details |
| `/v1/session/{id}` | DELETE | End session |
| `/v1/session/{id}/messages` | GET | List session messages |
### Chat

| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat` | POST | Send message and get response |
| `/v1/chat` | POST (`"stream": true`) | Stream response via SSE |
### A2A (Agent-to-Agent)

| Endpoint | Method | Description |
|---|---|---|
| `/a2a/conversation` | POST | A2A envelope protocol |
### Memory

| Endpoint | Method | Description |
|---|---|---|
| `/v1/memory/search` | POST | Semantic memory search |
| `/v1/memory/me` | GET | List user memories |
| `/v1/memory/{id}` | DELETE | Delete single memory |
| `/v1/memory/duplicates` | GET | Find duplicate memory groups |
| `/v1/memory/deduplicate` | POST | Merge duplicate memories |
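For example, a semantic search call might look like the sketch below; the request body fields (`query`, `user_id`, `top_k`) are assumptions for illustration, not the documented schema, so consult the API Reference for the authoritative contract:

```bash
# Hypothetical payload shape for semantic memory search
curl -X POST http://localhost:8000/v1/memory/search \
  -H "Content-Type: application/json" \
  -d '{"query": "favorite food", "user_id": "user_123", "top_k": 5}'
```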
### Health

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Basic health check |
| `/health/ready` | GET | Readiness probe |
| `/health/live` | GET | Liveness probe |
**Full documentation:** OpenAPI docs are available at `/docs` when the service is running.
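The machine-readable spec can also be fetched directly (FastAPI's default schema path, assuming a local instance):

```bash
# Download the OpenAPI schema for client generation or diffing
curl http://localhost:8000/openapi.json -o openapi.json
```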
## Authentication

Cortex supports two authentication methods.

### Option 1: User Authentication (Human API)

For human users accessing via a UI or apps:

```bash
curl -X POST "http://localhost:8000/v1/chat" \
  -H "Authorization: Bearer <user_token>" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "abc-123", "message": "Hello"}'
```

| Header | Value | Description |
|---|---|---|
| `Authorization` | `Bearer <token>` | User token validated against the database |
### Option 2: Agent Authentication (A2A / Machine-to-Machine)

For other agents or services calling Cortex via Traylinx Sentinel:

```bash
curl -X POST "http://localhost:8000/v1/chat" \
  -H "X-Agent-Secret-Token: <agent_token>" \
  -H "X-Agent-User-Id: <agent_id>" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "abc-123", "message": "Hello"}'
```

| Header | Value | Description |
|---|---|---|
| `X-Agent-Secret-Token` | Agent's secret token | Obtained from Traylinx Sentinel |
| `X-Agent-User-Id` | Agent's UUID | Agent identifier registered with Sentinel |
### Authentication Flow

```
Request arrives
 │
 ├── Has X-Agent-Secret-Token? ──▶ AGENT MODE (validate via Sentinel)
 │
 └── Has Authorization: Bearer? ──▶ USER MODE (validate via Database)
```
**Note:** Agent authentication headers take precedence over the Bearer token if both are provided.
### Using Dual-Auth in Code

Endpoints can use `get_caller_identity()` to support both auth modes:

```python
from fastapi import Depends, Request
from sqlalchemy.ext.asyncio import AsyncSession

from app.core.database import get_db  # assumed import path per the project layout
from app.core.security import get_caller_identity

# `router` is an APIRouter defined elsewhere in the module

@router.post("/my-endpoint")
async def my_endpoint(request: Request, db: AsyncSession = Depends(get_db)):
    caller = await get_caller_identity(request, db)
    if caller.is_agent:
        # Called by another agent
        agent_id = caller.id
    else:
        # Called by a human user
        user = caller.user  # Full User model
```
## 🧪 Development

### Running Tests

```bash
# All tests
poetry run pytest tests/ -v

# With coverage
poetry run pytest tests/ -v --cov=app --cov-report=term

# Unit tests only
poetry run pytest tests/unit/ -v

# Integration tests only
poetry run pytest tests/integration/ -v
```
### Code Quality

```bash
# Linting
poetry run ruff check app/

# Auto-fix
poetry run ruff check app/ --fix

# Formatting
poetry run ruff format app/

# Type checking
poetry run mypy app/ --ignore-missing-imports
```
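To run every gate in one pass locally, the commands above can be chained; `ruff format --check` verifies formatting without rewriting files:

```bash
# Fail fast if any quality gate fails
poetry run ruff check app/ \
  && poetry run ruff format --check app/ \
  && poetry run mypy app/ --ignore-missing-imports
```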
### Test Coverage Summary

| Category | Tests | Scope |
|---|---|---|
| Unit tests | 398 | Core services, models, utilities |
| Integration tests | 63 | End-to-end flows |
| Total | 461 | 75% line coverage |
## Configuration

Environment variables (see `.env.example`):

| Variable | Description | Required |
|---|---|---|
| `DATABASE_URL` | PostgreSQL connection string | Yes |
| `REDIS_URL` | Redis connection string | Yes |
| `LLM_BASE_URL` | Base URL for LLM calls (default: Makakoo SwitchAI) | No |
| `EMBEDDING_BASE_URL` | Base URL for embeddings (default: Makakoo SwitchAI) | No |
| `EMBEDDING_MODEL` | Embedding model (default: `mistral-embed`) | No |
| `EMBEDDING_DIMENSIONS` | Vector dimensions; set before the first migration! (default: 1024) | No |
| `OPENAI_API_KEY` | OpenAI API key (optional fallback) | No |
| `ANTHROPIC_API_KEY` | Anthropic API key (optional fallback) | No |
| `SENTINEL_URL` | Traylinx Sentinel URL | No |
| `STM_TOKEN_BUDGET` | Short-term memory budget (default: 4000) | No |
| `LTM_TOKEN_BUDGET` | Long-term memory budget (default: 1000) | No |
| `MODEL_FAST` | Fast model with provider prefix (e.g., `openai/gemini-2.5-flash`) | No |
| `MODEL_BALANCED` | Balanced model with provider prefix (e.g., `openai/llama-3.3-70b-versatile`) | No |
| `MODEL_POWERFUL` | Powerful model with provider prefix (e.g., `openai/deepseek-r1-distill-llama-70b`) | No |
| `CELERY_RESULT_BACKEND` | Redis URL for Celery task results (default: `redis://redis:6379/2`) | No |

⚠️ **Provider prefixes:** When using LiteLLM with a proxy (like SwitchAI), model names must include provider prefixes (e.g., `openai/`, `anthropic/`). See `.env.example` for details.

**Full configuration guide:** See the Integration Guide.
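As an illustrative starting point, a minimal `.env` could look like the sketch below; only the two required variables are set, and the connection-string values are placeholders, not project defaults:

```bash
# Minimal .env sketch (placeholder values)
DATABASE_URL=postgresql://cortex:secret@localhost:5432/cortex
REDIS_URL=redis://localhost:6379/0

# Optional: models routed through a proxy need provider prefixes
MODEL_FAST=openai/gemini-2.5-flash
EMBEDDING_MODEL=mistral-embed
```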
## 🐳 Docker Deployment

### Docker Compose (Development)

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f cortex-api

# Stop
docker-compose down
```
### Services

| Service | Port | Description |
|---|---|---|
| `cortex-api` | 8000 | Main API server |
| `postgres` | 5432 | PostgreSQL with pgvector |
| `redis` | 6379 | Redis for STM |
| `celery-worker` | - | Background task processor |
### Production Deployment

```bash
# Build production image
docker build -t traylinx-cortex:latest .

# Run with production settings
docker run -d \
  -p 8000:8000 \
  -e DATABASE_URL=... \
  -e REDIS_URL=... \
  -e LLM_BASE_URL=https://your-llm-proxy/v1 \
  -e EMBEDDING_BASE_URL=https://your-embedding-proxy/v1 \
  -e EMBEDDING_MODEL=mistral-embed \
  traylinx-cortex:latest
```
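For longer variable lists, the same image can read configuration from a file instead of repeated `-e` flags (the file name here is arbitrary):

```bash
# Run the production image with configuration from an env file
docker run -d -p 8000:8000 --env-file .env.production traylinx-cortex:latest
```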
## Monitoring

### Structured Logging

All logs are JSON-formatted with trace IDs:

```json
{
  "timestamp": "2024-11-27T10:00:00Z",
  "level": "INFO",
  "service": "traylinx-cortex",
  "trace_id": "abc-123",
  "event": "llm_call",
  "context": {
    "model": "gpt-4o",
    "tokens_in": 100,
    "tokens_out": 50,
    "latency_ms": 1234.5,
    "cost_usd": 0.002
  }
}
```
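Because each log line is a single JSON object, standard tools can slice the stream; for example, filtering `llm_call` events for cost (a sketch that assumes the raw JSON lines are on stdout, e.g. when running `uvicorn` directly):

```bash
# Surface model and cost for each llm_call event; non-JSON lines are skipped
poetry run uvicorn app.main:app 2>&1 \
  | jq -rR 'fromjson? | select(.event == "llm_call") | "\(.context.model)\t\(.context.cost_usd)"'
```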
### Health Endpoints

| Endpoint | Purpose |
|---|---|
| `/health` | Basic alive check |
| `/health/ready` | Full readiness (DB, Redis, dependencies) |
| `/health/live` | Kubernetes liveness probe |
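In deploy scripts, `/health/ready` can gate traffic until dependencies are up (a sketch, assuming the default port):

```bash
# Block until the service reports full readiness
until curl -fsS http://localhost:8000/health/ready > /dev/null; do
  echo "waiting for cortex-api..."
  sleep 2
done
```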
## Project Structure

```
traylinx_cortex/
├── app/                       # Application code
│   ├── api/                   # REST & A2A endpoints
│   │   ├── v1/                # Version 1 API
│   │   │   ├── chat.py        # Chat endpoints
│   │   │   ├── session.py     # Session management
│   │   │   └── health.py      # Health checks
│   │   └── a2a/               # Agent-to-Agent
│   │       └── conversation.py  # A2A protocol
│   ├── core/                  # Core utilities
│   │   ├── config.py          # Settings management
│   │   ├── database.py        # DB connection
│   │   ├── errors.py          # Custom exceptions
│   │   ├── logging.py         # Structured logging
│   │   └── security.py        # Authentication
│   ├── models/                # Data models
│   │   ├── api.py             # Pydantic schemas
│   │   └── db.py              # SQLAlchemy models
│   ├── services/              # Business logic
│   │   ├── embeddings.py      # Vector embeddings
│   │   ├── llm.py             # LLM routing
│   │   ├── memory.py          # STM & LTM managers
│   │   ├── orchestrator.py    # LangGraph state machine
│   │   ├── pii_scrubber.py    # Privacy protection
│   │   └── token_counter.py   # Token estimation
│   ├── workers/               # Background tasks
│   │   ├── celery_app.py      # Celery config
│   │   └── tasks.py           # Consolidation tasks
│   └── main.py                # FastAPI app
├── docs/                      # Documentation
│   ├── QUICK_REFERENCE.md     # Developer cheat sheet
│   ├── api_reference.md       # API documentation
│   ├── DATABASE_SCHEMA.md     # ERD diagrams
│   └── integration_guide.md   # Integration guide
├── migrations/                # Alembic migrations
├── tests/                     # Test suite
│   ├── unit/                  # Unit tests (398)
│   └── integration/           # Integration tests (63)
├── Dockerfile                 # Production build
├── docker-compose.yml         # Full stack
├── pyproject.toml             # Dependencies
└── .env.example               # Config template
```
## Contributing

- Read the Quick Reference for an overview
- Follow the Implementation Plan
- All code must pass `ruff check` and `ruff format`
- Maintain a minimum of 75% test coverage for new code
- Write property-based tests for critical logic
- Use conventional commits (see the example below)
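For reference, conventional commits follow a `type(scope): summary` pattern; the scope and summary below are illustrative only:

```bash
# Example conventional commit message
git commit -m "feat(memory): add semantic search filters"
```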
## License

© 2025 Traylinx. All rights reserved.

Production Ready • Contact: traylinx.com