Infrastructure

The ACP infrastructure is designed for low-latency, globally distributed consensus execution. The production stack runs on Cloudflare's edge network with Vectorize for semantic search and KV for caching. A Python FastAPI backend provides full engine access for local development and self-hosted deployments.

Deployment Architecture

The production deployment distributes the ACP stack across two primary platforms: Vercel for the frontend and Cloudflare for the edge API. Three GitHub repositories serve as the sources of truth for code, prompts, and axiom data.

Production Deployment Architecture
  +--------------------+      +--------------------+
  |   GitHub Repo 1    |      |   GitHub Repo 2    |
  |   ACP-PROJECT      |      |   ACP-PROMPTS      |
  +--------+-----------+      +--------+-----------+
           |                           |
           | GitHub Actions            | Raw URL fetch
           | CI/CD                     |
           v                           v
  +--------------------+      +--------------------+
  |   Vercel           |      |   Cloudflare       |
  |   Frontend         |<-----|   Worker API       |
  |   axiomprotocol.org|      |   Edge Functions   |
  +--------------------+      +--------+-----------+
                                       |
                              +--------+----------+
                              |                   |
                              v                   v
                      +---------------+   +---------------+
                      |  Vectorize    |   | KV Namespaces |
                      |  (3 indexes)  |   | - Cache       |
                      |  - axioms     |   | - Datasets    |
                      |  - queries    |   | - Rate limits |
                      |  - results    |   +---------------+
                      +-------+-------+
                              |
                      +-------+--------+
                      |  GitHub Repo 3 |
                      |  ACP-DATASETS  |
                      +----------------+

Cloudflare Workers (Edge API)

The Cloudflare Worker is the production API gateway for ACP. It runs on Cloudflare's global edge network, providing sub-100ms latency worldwide. The Worker handles the complete consensus pipeline: authentication, prompt loading, axiom retrieval, LLM orchestration, D-score calculation, and result caching.

Property   | Value
Runtime    | Cloudflare Workers (V8 isolate)
Language   | JavaScript / TypeScript
Deployment | Global edge -- 300+ data centers
Latency    | < 100ms to nearest edge node
Scaling    | Automatic -- serverless, zero cold starts
Local dev  | wrangler dev at http://localhost:8787
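
The pipeline stages described above (prompt loading, axiom retrieval, LLM orchestration, D-score calculation) can be sketched as a simple orchestration loop. This is an illustrative outline only; the function names, the toy stand-ins, and the 0.1 convergence threshold are assumptions for the example, not the Worker's actual code.

```python
# Illustrative sketch of the Worker's consensus pipeline stages.
# Function names and the 0.1 threshold are assumptions, not ACP's real code.

def run_consensus_pipeline(query, models, max_iterations=7, d_threshold=0.1):
    """Staged pipeline: axiom retrieval -> LLM fan-out -> D-score -> converge."""
    axioms = retrieve_axioms(query)            # semantic search (Vectorize in production)
    convergence_path = []
    for _ in range(max_iterations):
        answers = query_models(query, models, axioms)   # fan out to LLM providers
        d = compute_d_score(answers)                    # disagreement score
        convergence_path.append(d)
        if d < d_threshold:                             # consensus reached
            break
    return {
        "consensus_reached": convergence_path[-1] < d_threshold,
        "final_D": convergence_path[-1],
        "iterations_used": len(convergence_path),
        "convergence_path": convergence_path,
    }

# Toy stand-ins so the sketch runs end to end.
def retrieve_axioms(query):
    return ["acp-example-axiom-v1"]

def query_models(query, models, axioms):
    return {m: f"answer from {m}" for m in models}

_D_SEQUENCE = iter([0.35, 0.18, 0.08])  # canned D-scores for the demo

def compute_d_score(answers):
    return next(_D_SEQUENCE, 0.05)
```

With the canned D-scores above, the loop stops after three iterations, mirroring the example response later in this section.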

Worker Endpoints

Method | Path                 | Description
POST   | /consensus-iterative | Run iterative consensus with phi-spiral through axiom levels
GET    | /axioms/search?q=    | Semantic search over axioms via Vectorize
POST   | /cache/check         | Check semantic cache for a previously computed result
GET    | /cache/stats         | Cache hit rate, size, and performance statistics
POST   | /embeddings          | Generate text embeddings via Cloudflare AI
GET    | /similar?q=          | Find similar past queries for context
GET    | /health              | Health check with Vectorize and KV status
POST   | /seed-axioms         | Bulk seed axioms from ACP-DATASETS into Vectorize
GET    | /metrics             | Prometheus-format metrics for monitoring

Consensus Request Flow

POST /consensus-iterative request
{
  "query": "What is the fastest sorting algorithm?",
  "models": [
    "openai/gpt-5.4",
    "anthropic/claude-sonnet-4-6",
    "google/gemini-2.5-flash"
  ],
  "structure": "sonata",
  "max_iterations": 7
}
Consensus response
{
  "consensus_reached": true,
  "final_answer": "QuickSort has O(n log n) average complexity...",
  "final_D": 0.08,
  "iterations_used": 3,
  "convergence_path": [0.35, 0.18, 0.08],
  "axioms_used": [
    "acp-comp-quicksort-avg-v1",
    "acp-comp-timsort-python-v1"
  ],
  "proof": "Verified via MathOracle + WikidataOracle",
  "structure_used": "sonata",
  "timestamp": "2026-04-09T12:00:00Z"
}
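
A client can sanity-check a consensus response by confirming its fields agree with one another. A minimal sketch follows; the 0.1 threshold and the strictly decreasing convergence path are assumptions for illustration (real runs may not converge monotonically):

```python
def validate_consensus_response(resp, d_threshold=0.1):
    """Sanity-check a /consensus-iterative response body.

    The 0.1 default threshold and the strict-decrease check are assumptions
    for this sketch, not guarantees of the protocol.
    """
    path = resp["convergence_path"]
    strictly_decreasing = all(a > b for a, b in zip(path, path[1:]))
    return (
        resp["consensus_reached"]
        and resp["final_D"] == path[-1]          # final D matches last path entry
        and resp["iterations_used"] == len(path)  # one D-score per iteration
        and resp["final_D"] < d_threshold         # consensus criterion
        and strictly_decreasing
    )
```

Applied to the example response above, every check passes: 0.35 > 0.18 > 0.08, and 0.08 falls below the threshold after three iterations.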

Cloudflare Vectorize

Vectorize is Cloudflare's vector database, used by ACP for semantic search over the axiom corpus. All axioms are embedded as 768-dimensional vectors using the bge-base-en-v1.5 model via Cloudflare AI, enabling sub-millisecond approximate nearest neighbor (ANN) search.

Index             | Purpose                                                   | Dimensions | Vectors
VECTORIZE_AXIOMS  | Axiom semantic search -- find relevant axioms for a query | 768        | Dynamic
VECTORIZE_QUERIES | Query deduplication -- find similar past queries          | 768        | Dynamic
VECTORIZE_RESULTS | Result caching -- semantic cache for consensus outputs    | 768        | Dynamic

Embedding Model

Property        | Value
Model           | bge-base-en-v1.5 (BAAI)
Provider        | Cloudflare AI (built-in)
Dimensions      | 768
Sequence length | 512 tokens
Normalization   | L2 normalized
Search metric   | Cosine similarity
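
Because the embeddings are L2-normalized, cosine similarity reduces to a plain dot product, which is part of what makes ANN search over them cheap. A stdlib-only illustration of that equivalence:

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Cosine similarity between two vectors of any magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

a = l2_normalize([1.0, 2.0, 2.0])
b = l2_normalize([2.0, 1.0, 2.0])

# For unit vectors, the raw dot product equals the cosine similarity.
dot = sum(x * y for x, y in zip(a, b))
assert abs(dot - cosine_similarity(a, b)) < 1e-9
```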

When a query arrives, the Worker generates an embedding, queries the Vectorize axiom index with level filtering, and returns the top-K most relevant axioms. Results include the axiom statement, level, domain, confidence score, and semantic similarity score.

Axiom search via Vectorize
// 1. Generate embedding for the user query
const embedding = await env.AI.run(
  '@cf/baai/bge-base-en-v1.5',
  { text: [query] }
);

// 2. Query Vectorize with level filtering
const results = await env.VECTORIZE_AXIOMS.query(
  embedding.data[0],
  {
    topK: 5,
    filter: { level: [4, 5, 6, 7] },
    returnMetadata: true
  }
);

// 3. Return axioms with similarity scores
return results.matches.map(match => ({
  id: match.id,
  statement: match.metadata.statement,
  level: match.metadata.level,
  domain: match.metadata.domain,
  confidence: match.metadata.confidence,
  similarity: match.score
}));

KV Storage (Caching)

Cloudflare KV provides key-value storage at the edge for three primary caching functions: semantic result caching, dataset metadata storage, and rate limiting.

Namespace       | Purpose                                                                      | TTL
ACP_CACHE       | Consensus result cache -- stores full results keyed by query embedding hash | 24 hours
ACP_DATASETS    | Axiom metadata cache -- frequently accessed axiom data                       | 7 days
ACP_RATE_LIMITS | Rate limiting counters per API key                                           | 1 hour window

The semantic cache is a performance optimization. Before running a full consensus pipeline (which involves multiple LLM API calls), the Worker checks whether a semantically similar query has been resolved recently. If a cached result exists with sufficient similarity (cosine score > 0.95), it is returned immediately, saving time and API costs.
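
The check-before-compute logic can be sketched with a toy in-memory cache. The class and method names here are illustrative stand-ins for the Worker's actual KV + Vectorize code:

```python
import math
import time

SIMILARITY_THRESHOLD = 0.95  # cosine score above which a cached result is reused

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class SemanticCache:
    """Toy in-memory stand-in for the KV + Vectorize semantic cache."""

    def __init__(self, ttl_seconds=24 * 3600):  # mirrors the 24-hour ACP_CACHE TTL
        self.ttl = ttl_seconds
        self.entries = []  # (embedding, result, stored_at)

    def store(self, query_embedding, result, now=None):
        now = now if now is not None else time.time()
        self.entries.append((query_embedding, result, now))

    def check(self, query_embedding, now=None):
        """Return a cached result for a semantically similar query, or None."""
        now = now if now is not None else time.time()
        for emb, result, stored_at in self.entries:
            if now - stored_at > self.ttl:
                continue  # expired; real KV evicts these via its TTL
            if _cosine(emb, query_embedding) > SIMILARITY_THRESHOLD:
                return result  # cache hit: skip the full consensus pipeline
        return None
```

A hit means the expensive multi-LLM pipeline is skipped entirely; a miss falls through to the full consensus run, whose result is then stored for the next similar query.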

Cache Hit Rates

In production, the semantic cache achieves hit rates of 15-25% for common query patterns. This is especially effective for factual queries (Level 1-4 axioms) where the same questions are asked frequently. The cache is bypassed for Conclave Mode queries, which always require fresh independent responses.

Python FastAPI Backend

The Python backend is the reference implementation of the full ACP v4.0 protocol. It is designed for local development and self-hosted deployments where you need direct access to the consensus engine, oracle verification, and metrics database.

Component        | Technology              | Purpose
Web Framework    | FastAPI                 | Async HTTP API with OpenAPI documentation
Database         | PostgreSQL + SQLAlchemy | Metrics persistence, axiom registry, session history
Cache            | Redis                   | In-memory caching, task queue, session state
LLM Adapters     | httpx                   | Async clients for OpenAI, Anthropic, OpenRouter APIs
Consensus Engine | Custom Python           | Phi-spiral, musical structures, D-score, H-total, C_ij
Oracle System    | Custom Python           | HashOracle, MathOracle, WikidataOracle verification

Running the Python backend
# Install dependencies
pip install -r requirements.txt

# Start the API server
uvicorn main:app --reload

# Server runs at http://localhost:8000
# API docs at http://localhost:8000/docs (Swagger)
# Alternative docs at http://localhost:8000/redoc

Python Backend vs. Worker API

Aspect        | Worker API (Production)              | Python Backend (Local)
Deployment    | Cloudflare edge (global, serverless) | Local / self-hosted (single server)
Latency       | < 100ms (edge)                       | Depends on server location
Caching       | Vectorize + KV semantic cache        | Redis in-memory cache
Axiom search  | Vectorize ANN search                 | HTTP to Worker Vectorize endpoint
Database      | KV (key-value)                       | PostgreSQL (relational)
Oracle access | Limited (external HTTP)              | Full (direct oracle integration)
Best for      | Production traffic, public API       | Development, testing, full engine access

LLM Provider Integration

ACP supports multiple LLM providers through a unified adapter interface. The Worker primarily uses OpenRouter as a single gateway to multiple models, while the Python engine supports direct connections to each provider.

Provider   | Models                                                 | Integration
OpenRouter | GPT-4, Claude, Gemini, Llama, Mistral, and 100+ models | Primary gateway for the Worker API
OpenAI     | GPT-4, GPT-4 Turbo, GPT-3.5                            | Direct adapter in Python engine
Anthropic  | Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku       | Direct adapter in Python engine
Mock       | Deterministic test responses                           | Testing adapter for CI/CD
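
The unified adapter interface might look like the following sketch; the class and method names are assumptions for illustration, shown alongside the deterministic Mock adapter from the table above:

```python
from abc import ABC, abstractmethod

class LLMAdapter(ABC):
    """Sketch of a unified adapter interface; real adapters would wrap the
    OpenAI, Anthropic, or OpenRouter HTTP APIs (via httpx in the Python engine).
    Names here are illustrative, not ACP's actual class names."""

    @abstractmethod
    def complete(self, model: str, prompt: str) -> str:
        """Return a completion for the given model and prompt."""

class MockAdapter(LLMAdapter):
    """Deterministic responses for CI/CD, mirroring the Mock provider above."""

    def __init__(self, canned=None):
        self.canned = canned or {}  # model name -> fixed response

    def complete(self, model: str, prompt: str) -> str:
        return self.canned.get(model, f"[mock:{model}] {prompt}")
```

Because every provider sits behind the same interface, the consensus engine can fan a query out to any mix of models, or swap in the mock for fully reproducible tests.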

CI/CD Pipeline

The deployment pipeline uses GitHub Actions for continuous integration and automated deployment to both Vercel and Cloudflare.

CI/CD Pipeline
  git push to main
        |
        v
  +---------------------+
  |   GitHub Actions    |
  |                     |
  | 1. Lint (ruff,black)|
  | 2. Type check (mypy)|
  | 3. Tests (pytest)   |
  | 4. Build check      |
  +----------+----------+
             |
     +-------+-------+
     |               |
     v               v
  +----------+  +-----------+
  |  Vercel  |  | Cloudflare|
  | Frontend |  |  Worker   |
  | Deploy   |  |  Deploy   |
  +----------+  +-----------+

Stage           | Tool          | Purpose
Linting         | Ruff + Black  | Python code style enforcement
Type checking   | mypy          | Static type analysis for Python source
Testing         | pytest        | Unit and integration test suite
Frontend build  | Next.js build | Verify frontend compiles without errors
Frontend deploy | Vercel        | Automatic deployment on push to main
Worker deploy   | Wrangler      | Automatic deployment to Cloudflare Workers

Self-Hosted Deployment

For organizations that need to run ACP on their own infrastructure, the Python backend can be deployed as a Docker stack with PostgreSQL and Redis.

docker-compose.yml (simplified)
services:
  acp-backend:
    image: acp/backend:latest
    ports:
      - "8000:8000"
    environment:
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_URL=${REDIS_URL}
    depends_on:
      - postgres
      - redis

  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=acp
      - POSTGRES_USER=acp
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  pgdata:

Axiom Seeding

When self-hosting, you will need to seed axioms into your local environment. Clone ACP-DATASETS into the same parent directory and use the seed script at scripts/vectorize/seed-all-axioms.js to populate the Vectorize index. For fully offline deployments, the Python engine can read axioms directly from JSON files in data/axioms/.
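
For the offline path, a minimal loader might look like this. The per-file schema (each JSON file holding an array of axiom records) is an assumption for illustration; adapt it to the actual ACP-DATASETS layout:

```python
import json
from pathlib import Path

def load_axioms(axiom_dir="data/axioms"):
    """Load axiom records from JSON files for fully offline deployments.

    Assumes each *.json file contains a list of axiom dicts; this schema is
    illustrative, not a documented ACP-DATASETS format.
    """
    axioms = []
    for path in sorted(Path(axiom_dir).glob("*.json")):
        with open(path, encoding="utf-8") as f:
            axioms.extend(json.load(f))  # merge every file into one corpus
    return axioms
```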

Monitoring and Observability

The Worker exposes a /metrics endpoint in Prometheus format, and the /health endpoint reports the status of Vectorize, KV, and AI bindings. Key metrics to monitor include:

Metric                  | Description
Consensus success rate  | Percentage of queries that reach D < threshold
Average iterations      | Mean number of phi-spiral iterations to consensus
Cache hit rate          | Percentage of queries served from semantic cache
P95 latency             | 95th percentile response time for consensus requests
LLM error rate          | Failure rate of upstream LLM API calls
Vectorize query latency | Time to retrieve axioms from the vector index
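
Prometheus text exposition format is plain text, so a /metrics endpoint can be rendered with simple string formatting. The metric names in this sketch are hypothetical, not ACP's actual metric names:

```python
def render_prometheus_metrics(metrics):
    """Render a dict of metrics in Prometheus text exposition format.

    `metrics` maps metric name -> (help text, value). All metrics are emitted
    as gauges for simplicity; names below are illustrative only.
    """
    lines = []
    for name, (help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

# Hypothetical sample covering two of the metrics in the table above.
sample = {
    "acp_cache_hit_rate": ("Fraction of queries served from the semantic cache", 0.21),
    "acp_consensus_iterations_avg": ("Mean phi-spiral iterations to consensus", 2.4),
}
```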