
🍳 Cookbook: Agentic Upload Engine

Build an intelligent file ingestion system that automatically extracts, translates, and understands uploaded documents using OCR and AI agents.

🎯 What You'll Build

A service that:

1. Accepts file uploads (PDF, images, JSON, CSV)
2. Automatically performs OCR on images
3. Translates content if needed
4. Returns structured, searchable data


🏗️ Phase 1: Clone & Setup

1. Get the Code

git clone https://github.com/traylinx/agentic-upload-engines.git
cd agentic-upload-engines

2. Install Dependencies

pip install -r requirements.txt

3. Configure Environment

cp .env.example .env
nano .env

Key Variables:

# LLM Configuration
LLM_MODEL=gpt-4o
LLM_API_KEY=your-api-key
LLM_BASE_URL=https://switchai.traylinx.com/v1

# File Engine (Required!)
FILE_ENGINE_BASE_URL=https://api.traylinx.com/file-engine
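To show how these variables might be consumed at startup, here is a minimal sketch that reads them and fails fast when the required one is absent. The `Settings` class and `load_settings` function are hypothetical names for illustration, not part of the repository; the only requirement taken from the text above is that `FILE_ENGINE_BASE_URL` must be set.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in .env."""
    file_engine_base_url: str
    llm_model: Optional[str] = None
    llm_api_key: Optional[str] = None
    llm_base_url: Optional[str] = None


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    file_engine = env.get("FILE_ENGINE_BASE_URL")
    if not file_engine:
        # The cookbook marks this variable as required, so stop early.
        raise RuntimeError("FILE_ENGINE_BASE_URL is required but not set")
    return Settings(
        file_engine_base_url=file_engine.rstrip("/"),
        llm_model=env.get("LLM_MODEL"),
        llm_api_key=env.get("LLM_API_KEY"),
        llm_base_url=env.get("LLM_BASE_URL"),
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to check in isolation.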


🚀 Phase 2: Run the Service

uvicorn api:app --reload
# Server running at http://localhost:8000

🧪 Phase 3: Upload a Simple File

Upload a PDF

curl -X POST http://localhost:8000/v1/upload \
  -H "Authorization: Bearer your-token" \
  -F "file=@invoice.pdf"

Response:

{
  "file_id": "file_abc123",
  "filename": "invoice.pdf",
  "extracted_text": "Invoice #12345\nDate: 2025-01-15\nTotal: $500.00",
  "metadata": {
    "pages": 1,
    "has_images": false
  }
}
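The same upload can be made from Python. The sketch below uses only the standard library and assumes the endpoint, bearer header, and `file` form field shown in the curl command; `encode_multipart` and `upload` are illustrative helper names, and the server's exact error behavior is not covered here.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(filename, data, fields=None):
    """Encode optional text fields plus one part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in (fields or {}).items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts.append(
        (
            f'--{boundary}\r\nContent-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode()
    )
    parts.append(data)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def upload(path, token, base_url="http://localhost:8000", fields=None):
    """POST a local file to /v1/upload and return the decoded JSON response."""
    file_path = Path(path)
    body, content_type = encode_multipart(file_path.name, file_path.read_bytes(), fields)
    request = urllib.request.Request(
        f"{base_url}/v1/upload",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

With the service running locally, `upload("invoice.pdf", "your-token")` would be the equivalent of the curl call above.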


📷 Phase 4: OCR Processing

Upload an image and automatically extract text.

Upload an Image

curl -X POST http://localhost:8000/v1/upload \
  -H "Authorization: Bearer your-token" \
  -F "file=@receipt.jpg"

The engine:

1. Detects it's an image
2. Sends it to the OCR processor
3. Returns the extracted text
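The cookbook does not say how the engine decides that a file is an image. One common approach, sketched here purely as an assumption, is to inspect the file's leading bytes for the well-known JPEG, PNG, and WebP signatures. `sniff_image_type` is a hypothetical helper, not a function from the repository.

```python
from typing import Optional


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return "jpeg", "png", or "webp" based on file signatures, else None."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None
```

Checking content rather than the file name guards against uploads whose extension does not match their actual format.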


🌍 Phase 5: Upload + Translate

Process a document AND translate it in one request.

curl -X POST http://localhost:8000/v1/upload \
  -H "Authorization: Bearer your-token" \
  -F "file=@german_contract.pdf" \
  -F "translate_to=en"

Response:

{
  "file_id": "file_xyz789",
  "original_text": "Vertrag zwischen...",
  "translated_text": "Contract between...",
  "source_language": "de",
  "target_language": "en"
}
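Because the plain upload response carries `extracted_text` while the translated response carries `original_text` and `translated_text`, client code may want to normalize the two shapes. The following sketch assumes only the fields visible in the sample responses; `UploadResult` and `parse_upload_response` are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UploadResult:
    """Hypothetical normalized view of an upload response."""
    file_id: str
    text: str
    language: Optional[str]
    translated: bool


def parse_upload_response(payload: dict) -> UploadResult:
    """Prefer the translated text when the response includes one."""
    if "translated_text" in payload:
        return UploadResult(
            file_id=payload["file_id"],
            text=payload["translated_text"],
            language=payload.get("target_language"),
            translated=True,
        )
    # Fall back to the untranslated fields seen in the sample responses.
    text = payload.get("extracted_text", payload.get("original_text", ""))
    return UploadResult(payload["file_id"], text, payload.get("source_language"), False)
```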


🔄 The Processing Pipeline

          File Upload
             │
             ▼
     ┌───────────────┐
     │ Type Detector │ ← PDF? Image? JSON?
     └───────┬───────┘
             │
     ┌───────┴───────┐
     │               │
     ▼               ▼
┌─────────┐   ┌───────────┐
│   OCR   │   │  Parser   │
│ (Images)│   │ (Text/PDF)│
└────┬────┘   └─────┬─────┘
     │              │
     └──────┬───────┘
            ▼
     ┌─────────────┐
     │ Translator? │ ← If translate_to provided
     └──────┬──────┘
            ▼
     ┌─────────────┐
     │   Output    │
     └─────────────┘
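The flow in the diagram can be sketched as a single function that routes by file type and translates only when a target language is given. This is an illustration under stated assumptions, not the engine's actual code: the `ocr`, `parse`, and `translate` callables are stand-ins supplied by the caller, routing is done by extension for brevity, and the output keys are chosen for the example.

```python
from pathlib import Path
from typing import Callable, Optional

# Image extensions taken from the supported file types in this cookbook.
IMAGE_EXTENSIONS = {".jpg", ".png", ".webp"}


def process_upload(
    filename: str,
    data: bytes,
    ocr: Callable[[bytes], str],
    parse: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    translate_to: Optional[str] = None,
) -> dict:
    """Detect the type, extract text via OCR or parser, then optionally translate."""
    suffix = Path(filename).suffix.lower()
    extract = ocr if suffix in IMAGE_EXTENSIONS else parse
    text = extract(data)
    result = {"filename": filename, "text": text}
    if translate_to:
        # The translator stage runs only when translate_to is provided.
        result["translated_text"] = translate(text, translate_to)
        result["target_language"] = translate_to
    return result
```

Passing the stages in as callables keeps the routing logic independent of any particular OCR or translation backend.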

🔐 Phase 6: A2A Authentication

Enable machine-to-machine file uploads.

Add Credentials to .env

TRAYLINX_CLIENT_ID=ag-xxx
TRAYLINX_CLIENT_SECRET=ts-xxx

Upload as an Agent

curl -X POST http://localhost:8000/v1/upload \
  -H "X-Agent-Secret-Token: your-token" \
  -H "X-Agent-User-Id: your-agent-id" \
  -F "file=@data.csv"
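The two curl styles in this cookbook differ only in their headers: a bearer token for regular uploads and the `X-Agent-*` pair for agent uploads. A small helper can make that choice explicit. `build_auth_headers` is a hypothetical name, and how the token relates to `TRAYLINX_CLIENT_ID` and `TRAYLINX_CLIENT_SECRET` is not specified here, so the sketch takes the token as given.

```python
from typing import Dict, Optional


def build_auth_headers(token: str, agent_id: Optional[str] = None) -> Dict[str, str]:
    """Return agent headers when an agent ID is supplied, else a bearer header."""
    if agent_id:
        return {"X-Agent-Secret-Token": token, "X-Agent-User-Id": agent_id}
    return {"Authorization": f"Bearer {token}"}
```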

📊 Supported File Types

| Type      | Extensions          | Processing      |
|-----------|---------------------|-----------------|
| Documents | .pdf, .docx         | Text extraction |
| Images    | .jpg, .png, .webp   | OCR             |
| Data      | .json, .csv         | Parsing         |
| Text      | .txt, .md           | Direct read     |
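A client can check a file against the supported types before uploading. The lookup below mirrors the entries listed above; `processing_for` is a hypothetical helper, and rejecting unknown extensions with an error is a choice made for this sketch, not documented engine behavior.

```python
from pathlib import Path

# Extension-to-processing mapping, copied from the supported file types.
PROCESSING_BY_EXTENSION = {
    ".pdf": "Text extraction",
    ".docx": "Text extraction",
    ".jpg": "OCR",
    ".png": "OCR",
    ".webp": "OCR",
    ".json": "Parsing",
    ".csv": "Parsing",
    ".txt": "Direct read",
    ".md": "Direct read",
}


def processing_for(filename: str) -> str:
    """Return the processing step for a filename, or raise for unsupported types."""
    suffix = Path(filename).suffix.lower()
    if suffix not in PROCESSING_BY_EXTENSION:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    return PROCESSING_BY_EXTENSION[suffix]
```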

📚 Next Steps

  • Build a file search index with vector embeddings
  • Add webhook callbacks for async processing
  • Integrate with the RAG Cookbook for document Q&A