
NAT Traversal & Circuit Relay v2

Last Updated: January 2026
Stargate Version: v0.2.0+


Overview

Traylinx Stargate v0.2.0 introduces NAT traversal, enabling agents behind firewalls and NAT to communicate reliably through Circuit Relay v2.


The NAT Problem

Most agents run behind Network Address Translation (NAT), which prevents direct peer-to-peer connections:

Agent A (NAT)  ❌  Cannot connect directly  ❌  Agent B (NAT)
     │                                              │
     │         ✅ Solution: Relay Node ✅           │
     │                     │                        │
     └─────────────────────┴────────────────────────┘

Common NAT Scenarios:

  • Home networks behind routers
  • Corporate networks with firewalls
  • Cloud instances with security groups
  • Mobile devices on cellular networks


Circuit Relay v2

Circuit Relay v2 is a libp2p protocol that enables peers to communicate through public relay nodes.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    CIRCUIT RELAY v2 FLOW                    │
└─────────────────────────────────────────────────────────────┘

Peer A (NAT)              Relay Node              Peer B (NAT)
     │                    (Public IP)                   │
     │                         │                        │
     │  1. Connect to relay    │                        │
     │────────────────────────▶│                        │
     │                         │  2. Connect to relay   │
     │                         │◀───────────────────────│
     │                         │                        │
     │  3. Request relay to B  │                        │
     │────────────────────────▶│                        │
     │                         │  4. Establish circuit  │
     │                         │───────────────────────▶│
     │                         │                        │
     │  5. Data flows through relay                     │
     │◀────────────────────────┼───────────────────────▶│

Connection Priority

Stargate uses a 3-tier fallback strategy:

  1. Direct P2P (Lowest latency) - Attempts direct connection first
  2. Circuit Relay v2 (Medium latency) - Falls back to relay if direct fails
  3. NATS Relay (Highest reliability) - Final fallback for maximum compatibility
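
The fallback order above can be sketched as a loop over connection strategies. This is a minimal illustration rather than Stargate's implementation: try_direct, try_circuit_relay, and try_nats are hypothetical stand-ins for the three tiers, and the direct attempt is hard-coded to fail so that the fallback path is exercised.

```python
import asyncio
from typing import Awaitable, Callable, List

Connector = Callable[[str], Awaitable[str]]

# Hypothetical stand-ins for the three tiers; these are not Stargate APIs.
async def try_direct(peer_id: str) -> str:
    raise ConnectionError("direct connection blocked by NAT")

async def try_circuit_relay(peer_id: str) -> str:
    return f"relay:{peer_id}"

async def try_nats(peer_id: str) -> str:
    return f"nats:{peer_id}"

async def connect_with_fallback(peer_id: str, tiers: List[Connector]) -> str:
    """Try each tier in order and return the first connection that succeeds."""
    errors = []
    for tier in tiers:
        try:
            return await tier(peer_id)
        except ConnectionError as exc:
            errors.append(exc)  # remember why this tier failed, then move on
    raise ConnectionError(f"all tiers failed for {peer_id}: {errors}")

def demo() -> str:
    tiers = [try_direct, try_circuit_relay, try_nats]
    return asyncio.run(connect_with_fallback("peer_id", tiers))
```

Keeping the tiers in an ordered list means the priority can be changed without touching the loop itself.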

Features

1. Automatic NAT Detection

Stargate automatically detects your NAT configuration:

from traylinx_stargate import StarGateNode

node = StarGateNode(display_name="my-agent")
await node.start()

# Check NAT status
status = node.get_status()
print(f"NAT Type: {status['nat_status']['nat_type']}")
print(f"Public Addresses: {status['nat_status']['public_addrs']}")
print(f"Requires Relay: {status['nat_status']['requires_relay']}")

NAT Types Detected:

  • public - Direct internet connection
  • private - Behind NAT (10.x, 172.16-31.x, 192.168.x)
  • symmetric - Symmetric NAT (requires relay)
  • unknown - Unable to determine
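
As a rough illustration of the public/private distinction, the private ranges listed above (the RFC 1918 blocks) can be checked with Python's standard ipaddress module. The function name classify_ipv4 is hypothetical and this is not how Stargate performs detection. In particular, it cannot identify symmetric NAT, which requires probing from outside the local network.

```python
import ipaddress

# RFC 1918 private ranges, matching the "private" NAT type listed above.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def classify_ipv4(addr: str) -> str:
    """Return "private" for RFC 1918 addresses and "public" otherwise."""
    ip = ipaddress.ip_address(addr)
    if any(ip in net for net in PRIVATE_NETWORKS):
        return "private"
    return "public"
```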

2. Connection Pooling

Reuses connections to frequently-contacted peers:

# Connections are automatically pooled
result1 = await node.call("peer_id", "method1", {})
result2 = await node.call("peer_id", "method2", {})  # Reuses connection

# Inspect pool statistics (_connection_pool is a private attribute by Python convention)
node.transport._connection_pool.get_stats()
# {'size': 5, 'max_size': 50, 'ttl': 300.0}

Benefits:

  • Reduced latency for repeated calls
  • Lower network overhead
  • Automatic connection cleanup
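
To make the pooling behaviour concrete, here is a minimal sketch of a size-bounded pool with a time-to-live, using the max_size and ttl values from the stats output above as defaults. The class and its least-recently-used eviction policy are assumptions for illustration; Stargate's internal pool may work differently.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional

class ConnectionPool:
    """Tiny TTL-bounded pool keyed by peer ID (illustrative only)."""

    def __init__(self, max_size: int = 50, ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_size = max_size
        self.ttl = ttl
        self._clock = clock
        self._entries: OrderedDict = OrderedDict()  # peer_id -> (conn, stored_at)

    def get(self, peer_id: str) -> Optional[Any]:
        """Return a live pooled connection, or None if absent or expired."""
        entry = self._entries.get(peer_id)
        if entry is None:
            return None
        conn, stored_at = entry
        if self._clock() - stored_at > self.ttl:
            del self._entries[peer_id]  # expired: drop it
            return None
        self._entries.move_to_end(peer_id)  # mark as most recently used
        return conn

    def put(self, peer_id: str, conn: Any) -> None:
        """Store a connection, evicting the least recently used one if full."""
        if peer_id in self._entries:
            self._entries.pop(peer_id)
        elif len(self._entries) >= self.max_size:
            self._entries.popitem(last=False)
        self._entries[peer_id] = (conn, self._clock())

    def get_stats(self) -> dict:
        return {"size": len(self._entries), "max_size": self.max_size, "ttl": self.ttl}
```

The clock parameter exists only so that expiry can be tested without waiting for real time to pass.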

3. Connection Retry with Exponential Backoff

Automatic retry for transient failures:

# Retry is automatic with configurable parameters
result = await node.call(
    "peer_id",
    "method",
    payload,
    max_retries=3,           # Default: 3
    backoff_factor=2.0       # Default: 2.0 (exponential)
)

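Retry Schedule (with the defaults above, one initial call plus three retries):

  • Attempt 1: Immediate (initial call)
  • Attempt 2: After 2 seconds (retry 1)
  • Attempt 3: After 4 seconds (retry 2)
  • Attempt 4: After 8 seconds (retry 3)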
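
The schedule above is consistent with waiting backoff_factor ** n seconds before retry n. The following sketch works under that assumption; the exact formula Stargate uses is not stated here, and call_with_retry is a hypothetical helper rather than part of the library.

```python
import asyncio
from typing import Any, Awaitable, Callable, List

def backoff_delays(max_retries: int = 3, backoff_factor: float = 2.0) -> List[float]:
    """Seconds to wait before each retry: backoff_factor ** n for n = 1..max_retries."""
    return [backoff_factor ** n for n in range(1, max_retries + 1)]

async def call_with_retry(
    func: Callable[[], Awaitable[Any]],
    max_retries: int = 3,
    backoff_factor: float = 2.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Any:
    """Call func, retrying on ConnectionError with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return await func()
        except ConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            await sleep(backoff_factor ** (attempt + 1))
```

With the defaults, backoff_delays() yields 2, 4, and 8 seconds, matching the listed schedule.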

4. Comprehensive Metrics

Track connection performance:

# Get metrics for a specific peer
metrics = node.transport.get_metrics("peer_id")

print(f"Connection Type: {metrics.connection_type}")  # direct/relay/nats
print(f"Latency: {metrics.latency_ms}ms")
print(f"Success Rate: {metrics.success_rate:.1%}")
print(f"Total Requests: {metrics.total_requests}")
print(f"Failed Requests: {metrics.failed_requests}")
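
For intuition about how these fields relate, here is a small sketch of a metrics record in which success_rate is derived from the request counters. PeerMetrics is a hypothetical class, not Stargate's ConnectionMetrics, and the formula is an assumption based on the field names.

```python
from dataclasses import dataclass

@dataclass
class PeerMetrics:
    connection_type: str  # "direct", "relay", or "nats"
    total_requests: int = 0
    failed_requests: int = 0

    @property
    def success_rate(self) -> float:
        """Fraction of requests that succeeded, in the range 0.0 to 1.0."""
        if self.total_requests == 0:
            return 1.0  # assumed convention: no requests yet counts as fully successful
        return (self.total_requests - self.failed_requests) / self.total_requests

    def record(self, ok: bool) -> None:
        """Count one request and whether it failed."""
        self.total_requests += 1
        if not ok:
            self.failed_requests += 1
```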

5. Relay Health Monitoring

Automatic health checks for relay nodes:

# Health checks run every 5 seconds
# Failures detected within 10 seconds

# Get relay health status
health = node.transport.get_relay_health_status()

for relay, status in health.items():
    print(f"Relay: {relay}")
    print(f"  Healthy: {status['is_healthy']}")
    print(f"  Consecutive Failures: {status['consecutive_failures']}")
    print(f"  Last Success: {status['last_success']}")
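
The status fields above can be modelled with a small tracker. This sketch assumes a relay is marked unhealthy after two consecutive failed checks, which would fit a 5-second interval with detection within 10 seconds. That threshold and the RelayHealth class are assumptions, not documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RelayHealth:
    failure_threshold: int = 2  # assumed: two missed 5-second checks, about 10 seconds
    consecutive_failures: int = 0
    last_success: Optional[float] = None

    @property
    def is_healthy(self) -> bool:
        return self.consecutive_failures < self.failure_threshold

    def record(self, ok: bool, now: float) -> None:
        """Update state with the result of one health check taken at time `now`."""
        if ok:
            self.consecutive_failures = 0
            self.last_success = now
        else:
            self.consecutive_failures += 1
```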

Using Circuit Relay v2

Quick Start

import asyncio
from traylinx_stargate import StarGateNode

async def main():
    # Create node with libp2p transport
    node = StarGateNode(
        display_name="my-agent",
        transport="libp2p"
    )
    await node.start()

    # Enable Circuit Relay v2 (uses default Traylinx relays)
    await node.transport.enable_circuit_relay_v2()

    # Now you can connect to peers through relays
    result = await node.call("peer_id", "ping", {"message": "hello"})
    print(result)

    await node.stop()

asyncio.run(main())

Custom Relay Nodes

Use your own relay infrastructure:

# Configure custom relay nodes
custom_relays = [
    "/ip4/203.0.113.42/tcp/4001/p2p/QmRelay1...",
    "/ip4/203.0.113.43/tcp/4001/p2p/QmRelay2...",
]

await node.transport.enable_circuit_relay_v2(relay_addrs=custom_relays)
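
Before passing addresses to enable_circuit_relay_v2, it can help to sanity-check their shape. The helpers below are hypothetical and deliberately simplified: they only handle multiaddrs made of protocol/value pairs, as in the examples above, and do not validate the peer ID or cover value-less protocols such as /quic.

```python
from typing import List, Tuple

def parse_multiaddr(addr: str) -> List[Tuple[str, str]]:
    """Split a multiaddr string into (protocol, value) pairs."""
    parts = addr.strip("/").split("/")
    if not addr.startswith("/") or len(parts) % 2 != 0:
        raise ValueError(f"malformed multiaddr: {addr!r}")
    return list(zip(parts[0::2], parts[1::2]))

def is_relay_addr(addr: str) -> bool:
    """True if addr has a network address, a TCP port, and a /p2p peer ID."""
    try:
        protocols = dict(parse_multiaddr(addr))
    except ValueError:
        return False
    has_host = any(p in protocols for p in ("ip4", "ip6", "dns4", "dns6"))
    return has_host and "tcp" in protocols and "p2p" in protocols
```

A relay address without the trailing /p2p component is rejected here because the peer ID is what identifies the relay once the connection is made.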

Running a Relay Node

Deploy your own relay node for private networks:

# Start a relay node
traylinx stargate relay --port 4001 --max-connections 1000

# The relay will display its multiaddr:
# Relay node started: QmYourRelayPeerID...
# Multiaddrs: ['/ip4/203.0.113.42/tcp/4001/p2p/QmYourRelayPeerID...']

See Also: Relay Node Deployment Guide


Default Relay Nodes

Stargate includes default relay nodes operated by Traylinx:

Relay               Address                                             Location  Status
relay1.traylinx.io  /dns4/relay1.traylinx.io/tcp/4001/p2p/QmRelay1...   US East   Planned
relay2.traylinx.io  /dns4/relay2.traylinx.io/tcp/4001/p2p/QmRelay2...   EU West   Planned

Note: Default relay addresses are placeholders and will be updated when production relays are deployed.


Performance Characteristics

Latency Comparison

Connection Type    Typical Latency   Use Case
Direct P2P         5-50 ms           Same network or public IPs
Circuit Relay v2   50-200 ms         NAT traversal required
NATS Relay         100-300 ms        Maximum compatibility

Resource Usage

Connection Pool:

  • Default: 50 connections max
  • TTL: 300 seconds (5 minutes)
  • Memory: ~2-4 MB per connection

Relay Health Checks:

  • Interval: 5 seconds
  • Failure detection: < 10 seconds
  • Bandwidth: Minimal (~1 KB/check)


Troubleshooting

Connection Failures

Symptom: Cannot connect to peer

Diagnosis:

# Check NAT status
status = node.get_status()
print(status['nat_status'])

# Check relay status
if node.transport._circuit_relay_enabled:
    print(f"Relays: {node.transport._connected_relays}")
else:
    print("Circuit Relay v2 not enabled")

Solutions:

  1. Enable Circuit Relay v2: await node.transport.enable_circuit_relay_v2()
  2. Check relay health: node.transport.get_relay_health_status()
  3. Verify that your firewall allows outbound connections on port 4001

High Latency

Symptom: Slow response times

Diagnosis:

# Check connection metrics
metrics = node.transport.get_metrics("peer_id")
print(f"Connection Type: {metrics.connection_type}")
print(f"Latency: {metrics.latency_ms}ms")

Solutions:

  1. If using a relay, deploy relay nodes that are geographically closer
  2. Check network connectivity
  3. Consider a direct connection if both peers have public IPs

Relay Failures

Symptom: Relay connections failing

Diagnosis:

# Check relay health
health = node.transport.get_relay_health_status()
for relay, status in health.items():
    if not status['is_healthy']:
        print(f"Unhealthy relay: {relay}")
        print(f"Failures: {status['consecutive_failures']}")

Solutions:

  1. Configure multiple relay nodes for redundancy
  2. Deploy your own relay nodes
  3. Check relay node logs for issues
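
To act on the redundancy advice, an application could check how many relays are currently healthy and raise an alert when too few remain. This sketch takes a dictionary shaped like the get_relay_health_status() output used in the diagnosis snippet. The helper name and the minimum of two healthy relays are assumptions.

```python
from typing import Dict, List, Tuple

def pick_healthy_relays(health: Dict[str, dict], min_healthy: int = 2) -> Tuple[List[str], bool]:
    """Return the healthy relay addresses and whether enough of them remain."""
    healthy = sorted(relay for relay, status in health.items() if status["is_healthy"])
    return healthy, len(healthy) >= min_healthy
```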


Best Practices

1. Use Multiple Relays

Configure at least 2-3 relay nodes for redundancy:

# Hostnames use /dns4 (or /dns6) in a multiaddr; /ip4 takes a literal IPv4 address
relays = [
    "/dns4/relay1.example.com/tcp/4001/p2p/QmRelay1...",
    "/dns4/relay2.example.com/tcp/4001/p2p/QmRelay2...",
    "/dns4/relay3.example.com/tcp/4001/p2p/QmRelay3...",
]
await node.transport.enable_circuit_relay_v2(relay_addrs=relays)

2. Monitor Connection Metrics

Track performance over time:

# Periodically check metrics
all_metrics = node.transport.get_metrics()
for peer_id, metrics in all_metrics.items():
    if metrics.success_rate < 0.9:  # Less than 90% success
        print(f"Warning: Low success rate for {peer_id}")

3. Handle Connection Failures Gracefully

Implement fallback logic in your application:

async def call_with_fallback(node, peer_id, method, payload):
    try:
        return await node.call(peer_id, method, payload)
    except ConnectionError:
        # fallback_method is a placeholder for your own logic,
        # for example trying an alternative peer or method
        return await fallback_method(payload)

4. Deploy Relay Nodes for Production

For production deployments, run your own relay infrastructure:

  • Deploy in multiple geographic regions
  • Use high-bandwidth servers (1+ Gbps)
  • Monitor relay health and performance
  • Configure appropriate connection limits

See: Relay Node Deployment Guide


Security Considerations

Relay Node Trust

Relay nodes can see:

  • Connection metadata (who is connecting to whom)
  • Message sizes and timing

Relay nodes cannot see:

  • Message content (encrypted end-to-end)
  • Private keys or identities

Recommendation: Only use trusted relay nodes or deploy your own.

End-to-End Encryption

All messages are encrypted regardless of connection type:

# Messages are automatically encrypted with the Noise protocol
# No additional configuration needed
result = await node.call("peer_id", "method", {"sensitive": "data"})

Rate Limiting

Relay nodes enforce connection limits:

relay:
  max_connections: 1000  # Per relay node
  bandwidth_limit_mbps: 100  # Future feature
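
A connection limit like max_connections can be enforced with a simple counter. This is a minimal sketch of the idea, not the relay's actual implementation. ConnectionLimiter and its method names are hypothetical.

```python
class ConnectionLimiter:
    """Reject new connections once max_connections are active (illustrative)."""

    def __init__(self, max_connections: int = 1000) -> None:
        self.max_connections = max_connections
        self.active = 0

    def try_acquire(self) -> bool:
        """Reserve a slot for a new connection; return False if the relay is full."""
        if self.active >= self.max_connections:
            return False
        self.active += 1
        return True

    def release(self) -> None:
        """Free a slot when a connection closes."""
        self.active = max(0, self.active - 1)
```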

API Reference

enable_circuit_relay_v2(relay_addrs=None, hop_enabled=False)

Enable Circuit Relay v2 for NAT traversal.

Parameters:

  • relay_addrs (list[str] | None): List of relay node multiaddrs. If None, uses default Traylinx relay nodes.
  • hop_enabled (bool): If True, this node can act as a relay for others. Default: False

Returns:

  • bool: True if relay was successfully enabled

Raises:

  • ConnectionError: If not connected to the network

get_metrics(peer_id=None)

Get connection metrics for one or all peers.

Parameters:

  • peer_id (str | None): Specific peer ID. If None, returns all metrics.

Returns:

  • dict[str, ConnectionMetrics] or ConnectionMetrics or None

get_relay_health_status(relay_addr=None)

Get health status for relay nodes.

Parameters:

  • relay_addr (str | None): Specific relay address. If None, returns all relay health status.

Returns:

  • dict: Health status information


See Also