Infrastructure
The ACP infrastructure is designed for low-latency, globally distributed consensus execution. The production stack runs on Cloudflare's edge network with Vectorize for semantic search and KV for caching. A Python FastAPI backend provides full engine access for local development and self-hosted deployments.
Deployment Architecture
The production deployment distributes the ACP stack across two primary platforms: Vercel for the frontend and Cloudflare for the edge API. The three GitHub repositories serve as the source of truth for code, prompts, and axiom data.
+--------------------+ +--------------------+
| GitHub Repo 1 | | GitHub Repo 2 |
| ACP-PROJECT | | ACP-PROMPTS |
+--------+-----------+ +--------+-----------+
| |
| GitHub Actions | Raw URL fetch
| CI/CD |
v v
+--------------------+ +--------------------+
| Vercel | | Cloudflare |
| Frontend |<-----| Worker API |
| axiomprotocol.org| | Edge Functions |
+--------------------+ +--------+-----------+
|
+--------+----------+
| |
v v
+---------------+ +---------------+
| Vectorize | | KV Namespaces |
| (3 indexes) | | - Cache |
| - axioms | | - Datasets |
| - queries | | - Rate limits |
| - results | +---------------+
+-------+-------+
|
+-------+--------+
| GitHub Repo 3 |
| ACP-DATASETS |
+----------------+

Cloudflare Workers (Edge API)
The Cloudflare Worker is the production API gateway for ACP. It runs on Cloudflare's global edge network, providing sub-100ms latency worldwide. The Worker handles the complete consensus pipeline: authentication, prompt loading, axiom retrieval, LLM orchestration, D-score calculation, and result caching.
| Property | Value |
|---|---|
| Runtime | Cloudflare Workers (V8 isolate) |
| Language | JavaScript / TypeScript |
| Deployment | Global edge -- 300+ data centers |
| Latency | < 100ms to nearest edge node |
| Scaling | Automatic -- serverless, zero cold starts |
| Local dev | wrangler dev at http://localhost:8787 |
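The Worker reaches Vectorize, KV, and Workers AI through bindings on the `env` object, declared in `wrangler.toml`. A minimal sketch of how those bindings might be configured; the binding names match this page, but the index names and namespace IDs are placeholders, not the repository's actual values:

```toml
name = "acp-worker"
main = "src/index.js"
compatibility_date = "2024-01-01"

# Workers AI binding used for bge-base-en-v1.5 embeddings
[ai]
binding = "AI"

# Vectorize indexes (768 dimensions, cosine metric)
[[vectorize]]
binding = "VECTORIZE_AXIOMS"
index_name = "acp-axioms"

[[vectorize]]
binding = "VECTORIZE_QUERIES"
index_name = "acp-queries"

[[vectorize]]
binding = "VECTORIZE_RESULTS"
index_name = "acp-results"

# KV namespaces for caching, datasets, and rate limiting
[[kv_namespaces]]
binding = "ACP_CACHE"
id = "<cache-namespace-id>"

[[kv_namespaces]]
binding = "ACP_DATASETS"
id = "<datasets-namespace-id>"

[[kv_namespaces]]
binding = "ACP_RATE_LIMITS"
id = "<rate-limits-namespace-id>"
```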
Worker Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /consensus-iterative | Run iterative consensus with phi-spiral through axiom levels |
| GET | /axioms/search?q= | Semantic search over axioms via Vectorize |
| POST | /cache/check | Check semantic cache for a previously computed result |
| GET | /cache/stats | Cache hit rate, size, and performance statistics |
| POST | /embeddings | Generate text embeddings via Cloudflare AI |
| GET | /similar?q= | Find similar past queries for context |
| GET | /health | Health check with Vectorize and KV status |
| POST | /seed-axioms | Bulk seed axioms from ACP-DATASETS into Vectorize |
| GET | /metrics | Prometheus-format metrics for monitoring |
Consensus Request Flow

Request:

{
  "query": "What is the fastest sorting algorithm?",
  "models": [
    "openai/gpt-5.4",
    "anthropic/claude-sonnet-4-6",
    "google/gemini-2.5-flash"
  ],
  "structure": "sonata",
  "max_iterations": 7
}

Response:

{
  "consensus_reached": true,
  "final_answer": "QuickSort has O(n log n) average complexity...",
  "final_D": 0.08,
  "iterations_used": 3,
  "convergence_path": [0.35, 0.18, 0.08],
  "axioms_used": [
    "acp-comp-quicksort-avg-v1",
    "acp-comp-timsort-python-v1"
  ],
  "proof": "Verified via MathOracle + WikidataOracle",
  "structure_used": "sonata",
  "timestamp": "2026-04-09T12:00:00Z"
}

Cloudflare Vectorize
Vectorize is Cloudflare's vector database, used by ACP for semantic search over the axiom corpus. All axioms are embedded as 768-dimensional vectors using the bge-base-en-v1.5 model via Cloudflare AI, enabling sub-millisecond approximate nearest neighbor (ANN) search.
| Index | Purpose | Dimensions | Vectors |
|---|---|---|---|
| VECTORIZE_AXIOMS | Axiom semantic search -- find relevant axioms for a query | 768 | Dynamic |
| VECTORIZE_QUERIES | Query deduplication -- find similar past queries | 768 | Dynamic |
| VECTORIZE_RESULTS | Result caching -- semantic cache for consensus outputs | 768 | Dynamic |
Embedding Model
| Property | Value |
|---|---|
| Model | bge-base-en-v1.5 (BAAI) |
| Provider | Cloudflare AI (built-in) |
| Dimensions | 768 |
| Sequence length | 512 tokens |
| Normalization | L2 normalized |
| Search metric | Cosine similarity |
Axiom Search Pipeline
When a query arrives, the Worker generates an embedding, queries the Vectorize axiom index with level filtering, and returns the top-K most relevant axioms. Results include the axiom statement, level, domain, confidence score, and semantic similarity score.
// 1. Generate embedding for the user query
const embedding = await env.AI.run(
  '@cf/baai/bge-base-en-v1.5',
  { text: [query] }
);

// 2. Query Vectorize, filtering to axiom levels 4-7
const results = await env.VECTORIZE_AXIOMS.query(
  embedding.data[0],
  {
    topK: 5,
    filter: { level: { $in: [4, 5, 6, 7] } },
    returnMetadata: true
  }
);

// 3. Return axioms with similarity scores
return results.matches.map(match => ({
  id: match.id,
  statement: match.metadata.statement,
  level: match.metadata.level,
  domain: match.metadata.domain,
  confidence: match.metadata.confidence,
  similarity: match.score
}));

KV Storage (Caching)
Cloudflare KV provides key-value storage at the edge for three primary caching functions: semantic result caching, dataset metadata storage, and rate limiting.
| Namespace | Purpose | TTL |
|---|---|---|
| ACP_CACHE | Consensus result cache -- stores full results keyed by query embedding hash | 24 hours |
| ACP_DATASETS | Axiom metadata cache -- frequently accessed axiom data | 7 days |
| ACP_RATE_LIMITS | Rate limiting counters per API key | 1-hour window |
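The rate-limit namespace lends itself to a fixed-window counter: one key per API key per window, incremented on each request and expired by TTL. A sketch of that logic; the key format and request limit are illustrative, not taken from the repository:

```javascript
// Fixed-window rate limiting: requests are counted per (apiKey, window) pair.
// The window id changes every windowSeconds, so stale counters simply expire.
const WINDOW_SECONDS = 3600; // matches the 1-hour window on ACP_RATE_LIMITS
const MAX_REQUESTS = 100;    // illustrative limit, not from the repository

// Derive the KV key for the window containing the given timestamp.
function rateLimitKey(apiKey, nowMs, windowSeconds = WINDOW_SECONDS) {
  const windowId = Math.floor(nowMs / 1000 / windowSeconds);
  return `ratelimit:${apiKey}:${windowId}`;
}

// Decide whether a request is allowed given the current counter value.
function isAllowed(currentCount, maxRequests = MAX_REQUESTS) {
  return currentCount < maxRequests;
}

// In the Worker, the counter would live in the ACP_RATE_LIMITS namespace:
//   const key = rateLimitKey(apiKey, Date.now());
//   const count = parseInt((await env.ACP_RATE_LIMITS.get(key)) ?? '0', 10);
//   if (!isAllowed(count)) return new Response('rate limited', { status: 429 });
//   await env.ACP_RATE_LIMITS.put(key, String(count + 1),
//     { expirationTtl: WINDOW_SECONDS });
```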
The semantic cache is a performance optimization. Before running a full consensus pipeline (which involves multiple LLM API calls), the Worker checks whether a semantically similar query has been resolved recently. If a cached result exists with sufficient similarity (cosine score > 0.95), it is returned immediately, saving time and API costs.
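Because the embeddings are L2-normalized, cosine similarity reduces to a plain dot product. A minimal sketch of the cache-hit decision; the 0.95 threshold comes from the text above, while the function names are illustrative:

```javascript
// Cosine similarity of two L2-normalized embeddings is their dot product.
function cosineSimilarity(a, b) {
  let dot = 0;
  for (let i = 0; i < a.length; i++) dot += a[i] * b[i];
  return dot;
}

const CACHE_SIMILARITY_THRESHOLD = 0.95;

// A cached result is reused only if the best Vectorize match
// clears the similarity threshold.
function cacheHit(bestMatchScore, threshold = CACHE_SIMILARITY_THRESHOLD) {
  return bestMatchScore > threshold;
}
```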
Cache Hit Rates
In production, the semantic cache achieves hit rates of 15-25% for common query patterns. This is especially effective for factual queries (Level 1-4 axioms) where the same questions are asked frequently. The cache is bypassed for Conclave Mode queries, which always require fresh independent responses.
Python FastAPI Backend
The Python backend is the reference implementation of the full ACP v4.0 protocol. It is designed for local development and self-hosted deployments where you need direct access to the consensus engine, oracle verification, and metrics database.
| Component | Technology | Purpose |
|---|---|---|
| Web Framework | FastAPI | Async HTTP API with OpenAPI documentation |
| Database | PostgreSQL + SQLAlchemy | Metrics persistence, axiom registry, session history |
| Cache | Redis | In-memory caching, task queue, session state |
| LLM Adapters | httpx | Async clients for OpenAI, Anthropic, OpenRouter APIs |
| Consensus Engine | Custom Python | Phi-spiral, musical structures, D-score, H-total, C_ij |
| Oracle System | Custom Python | HashOracle, MathOracle, WikidataOracle verification |
# Install dependencies
pip install -r requirements.txt
# Start the API server
uvicorn main:app --reload
# Server runs at http://localhost:8000
# API docs at http://localhost:8000/docs (Swagger)
# Alternative docs at http://localhost:8000/redoc

Python Backend vs. Worker API
| Aspect | Worker API (Production) | Python Backend (Local) |
|---|---|---|
| Deployment | Cloudflare edge (global, serverless) | Local / self-hosted (single server) |
| Latency | < 100ms (edge) | Depends on server location |
| Caching | Vectorize + KV semantic cache | Redis in-memory cache |
| Axiom search | Vectorize ANN search | HTTP to Worker Vectorize endpoint |
| Database | KV (key-value) | PostgreSQL (relational) |
| Oracle access | Limited (external HTTP) | Full (direct oracle integration) |
| Best for | Production traffic, public API | Development, testing, full engine access |
LLM Provider Integration
ACP supports multiple LLM providers through a unified adapter interface. The Worker primarily uses OpenRouter as a single gateway to multiple models, while the Python engine supports direct connections to each provider.
| Provider | Models | Integration |
|---|---|---|
| OpenRouter | GPT-4, Claude, Gemini, Llama, Mistral, and 100+ models | Primary gateway for the Worker API |
| OpenAI | GPT-4, GPT-4 Turbo, GPT-3.5 | Direct adapter in Python engine |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku | Direct adapter in Python engine |
| Mock | Deterministic test responses | Testing adapter for CI/CD |
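The unified adapter interface can be pictured as a single async complete(prompt) contract that every provider implements; the mock adapter returns deterministic output so CI runs need no API keys. A sketch under that assumption; the class and method names are illustrative, not the engine's actual interface:

```javascript
// Every provider adapter exposes the same async complete() contract,
// so the consensus engine can treat OpenRouter, OpenAI, Anthropic,
// and the mock identically.
class MockAdapter {
  constructor(responses = {}) {
    this.responses = responses; // fixed map: prompt -> reply
  }
  async complete(prompt) {
    // Deterministic: the same prompt always yields the same reply.
    return this.responses[prompt] ?? 'mock response';
  }
}

class OpenRouterAdapter {
  constructor(apiKey, model) {
    this.apiKey = apiKey;
    this.model = model;
  }
  async complete(prompt) {
    // A real adapter POSTs to the provider's chat completions API.
    const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: this.model,
        messages: [{ role: 'user', content: prompt }],
      }),
    });
    const data = await res.json();
    return data.choices[0].message.content;
  }
}
```

Swapping the mock for a real adapter changes nothing upstream, which is what makes deterministic CI runs cheap.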
CI/CD Pipeline
The deployment pipeline uses GitHub Actions for continuous integration and automated deployment to both Vercel and Cloudflare.
git push to main
|
v
+---------------------+
| GitHub Actions |
| |
| 1. Lint (ruff,black)|
| 2. Type check (mypy)|
| 3. Tests (pytest) |
| 4. Build check |
+----------+----------+
|
+-------+-------+
| |
v v
+----------+ +-----------+
| Vercel | | Cloudflare|
| Frontend | | Worker |
| Deploy | | Deploy |
+----------+ +-----------+

| Stage | Tool | Purpose |
|---|---|---|
| Linting | Ruff + Black | Python code style enforcement |
| Type checking | mypy | Static type analysis for Python source |
| Testing | pytest | Unit and integration test suite |
| Frontend build | Next.js build | Verify frontend compiles without errors |
| Frontend deploy | Vercel | Automatic deployment on push to main |
| Worker deploy | Wrangler | Automatic deployment to Cloudflare Workers |
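The stages above map onto a straightforward workflow file. A sketch of the CI checks job; the file paths and job names are illustrative, and the deploy steps are elided:

```yaml
name: CI
on:
  push:
    branches: [main]
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install -r requirements.txt
      - run: ruff check .     # 1. Lint
      - run: black --check .  # 1. Lint (formatting)
      - run: mypy .           # 2. Type check
      - run: pytest           # 3. Tests
```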
Self-Hosted Deployment
For organizations that need to run ACP on their own infrastructure, the Python backend can be deployed as a Docker stack with PostgreSQL and Redis.
services:
  acp-backend:
    image: acp/backend:latest
    ports:
      - "8000:8000"
    environment:
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_URL=${REDIS_URL}
    depends_on:
      - postgres
      - redis
  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=acp
      - POSTGRES_USER=acp
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  pgdata:

Axiom Seeding
When self-hosting, you will need to seed axioms into your local environment. Clone ACP-DATASETS into the same parent directory and use the seed script at scripts/vectorize/seed-all-axioms.js to populate the Vectorize index. For fully offline deployments, the Python engine can read axioms directly from JSON files in data/axioms/.
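The seeding step boils down to turning each axiom record into the vector object Vectorize stores: an id, a 768-dimensional embedding of the statement, and the metadata the search pipeline returns. A sketch of that transformation; the field names follow the search-result shape shown earlier, and the embedding is passed in so the function has no network dependency:

```javascript
// Convert one axiom record into the { id, values, metadata } shape
// that a Vectorize upsert expects. The caller supplies the embedding
// (e.g. from the Worker's /embeddings endpoint).
function toVectorizeRecord(axiom, embedding) {
  if (embedding.length !== 768) {
    throw new Error('expected a 768-dimensional bge-base-en-v1.5 embedding');
  }
  return {
    id: axiom.id,        // e.g. "acp-comp-quicksort-avg-v1"
    values: embedding,   // 768-dim vector for the statement text
    metadata: {
      statement: axiom.statement,
      level: axiom.level,
      domain: axiom.domain,
      confidence: axiom.confidence,
    },
  };
}

// The seed script would then batch these records into the axiom index,
// e.g. via env.VECTORIZE_AXIOMS.upsert(records) or POST /seed-axioms.
```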
Monitoring and Observability
The Worker exposes a /metrics endpoint in Prometheus format, and the /health endpoint reports the status of Vectorize, KV, and AI bindings. Key metrics to monitor include:
| Metric | Description |
|---|---|
| Consensus success rate | Percentage of queries that reach D < threshold |
| Average iterations | Mean number of phi-spiral iterations to consensus |
| Cache hit rate | Percentage of queries served from semantic cache |
| P95 latency | 95th percentile response time for consensus requests |
| LLM error rate | Failure rate of upstream LLM API calls |
| Vectorize query latency | Time to retrieve axioms from the vector index |
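The /metrics endpoint serves values like these in the Prometheus text exposition format: a # HELP and # TYPE line per metric followed by the sample. A sketch of the rendering; the metric names are illustrative, not the Worker's actual names:

```javascript
// Render a set of gauges in the Prometheus text exposition format.
function renderMetrics(metrics) {
  const lines = [];
  for (const { name, help, value } of metrics) {
    lines.push(`# HELP ${name} ${help}`);
    lines.push(`# TYPE ${name} gauge`);
    lines.push(`${name} ${value}`);
  }
  return lines.join('\n') + '\n';
}

// Example: two of the metrics from the table above.
const body = renderMetrics([
  { name: 'acp_cache_hit_rate',
    help: 'Fraction of queries served from the semantic cache', value: 0.21 },
  { name: 'acp_consensus_avg_iterations',
    help: 'Mean phi-spiral iterations to consensus', value: 3.2 },
]);
```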