Code Examples
Five working examples that demonstrate ACP from basic consensus queries through domain-specific custom axioms. Each example includes complete code, expected output metrics, and interpretation guidance.
Prerequisites
- Python 3.11+ installed
- OpenRouter API key -- get one at openrouter.ai/keys
- ACP-PROJECT repository cloned
cd ACP-PROJECT
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
export OPENROUTER_API_KEY="sk-or-v1-YOUR_KEY"
# Windows: set OPENROUTER_API_KEY=sk-or-v1-YOUR_KEY
Working directory
All examples must be run from the project root directory. Running from any other location will cause ModuleNotFoundError: No module named 'src'.
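If you prefer a programmatic guard, a short check at the top of a script can fail fast with a clearer message than the traceback. This is a minimal sketch of our own, not part of the ACP API; it only assumes the `src/` directory name from the repository layout above:

```python
import os

def in_project_root(path: str = ".") -> bool:
    """Return True if `path` looks like the ACP-PROJECT root (it contains src/)."""
    return os.path.isdir(os.path.join(path, "src"))

# Example guard at the top of a script:
# if not in_project_root():
#     raise SystemExit("Run from the ACP-PROJECT root (src/ not found).")
```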
Example 1: Simple Consensus Query
The most basic ACP use case: send a simple factual question to multiple models and observe how they converge on a single answer with a near-zero D-score.
import asyncio
import os
from src.engine import ConsensusEngine, ConsensusConfig
from src.llm.openrouter_llm import OpenRouterLLM

async def main():
    api_key = os.environ["OPENROUTER_API_KEY"]
    models = [
        OpenRouterLLM(api_key=api_key, model="openai/gpt-5.4-mini"),
        OpenRouterLLM(api_key=api_key, model="anthropic/claude-haiku-4-5"),
    ]
    config = ConsensusConfig(
        max_iterations=5,
        D_threshold=0.1,
    )
    engine = ConsensusEngine(models=models, config=config)
    result = await engine.run("What is 2 + 2?")
    print(f"Consensus reached: {result.consensus_reached}")
    print(f"D-score: {result.final_D:.4f}")
    print(f"Iterations: {result.iterations_used}")
    print(f"Answer: {result.final_answer}")

asyncio.run(main())
Expected output
Consensus reached: True
D-score: 0.0000
Iterations: 1
Answer: 4
A D-score of 0.0 indicates perfect agreement. For a trivially verifiable fact like "2 + 2 = 4", all models converge immediately in a single iteration.
Example 2: Fact Checking
Verify factual claims across multiple domains by grounding consensus in specific axiom levels. This example checks mathematical constants, physical constants, algorithm complexity, protocol definitions, and language characteristics.
import asyncio
import os
from src.engine import ConsensusEngine, ConsensusConfig
from src.llm.openrouter_llm import OpenRouterLLM

QUERIES = [
    {
        "query": "What is the value of Pi to 10 decimal places?",
        "axiom_level": [1],  # Mathematical axioms
    },
    {
        "query": "What is the speed of light in vacuum in m/s?",
        "axiom_level": [2],  # Physical axioms
    },
    {
        "query": "What is the average-case time complexity of QuickSort?",
        "axiom_level": [4],  # Computable axioms
    },
    {
        "query": "What transport protocol does TCP use for reliable delivery?",
        "axiom_level": [6],  # Protocol axioms
    },
    {
        "query": "Is Python statically or dynamically typed?",
        "axiom_level": [7],  # Linguistic axioms
    },
]

async def main():
    api_key = os.environ["OPENROUTER_API_KEY"]
    models = [
        OpenRouterLLM(api_key=api_key, model="openai/gpt-5.4-mini"),
        OpenRouterLLM(api_key=api_key, model="anthropic/claude-haiku-4-5"),
        OpenRouterLLM(api_key=api_key, model="google/gemini-flash-1.5"),
    ]
    config = ConsensusConfig(max_iterations=5, D_threshold=0.05)
    engine = ConsensusEngine(models=models, config=config)
    for item in QUERIES:
        result = await engine.run(
            query=item["query"],
            axiom_level=item["axiom_level"],
        )
        print(f"Query: {item['query']}")
        print(f" D-score: {result.final_D:.4f}")
        print(f" Consensus: {result.consensus_reached}")
        print(f" Answer: {result.final_answer}")
        print()

asyncio.run(main())
Expected output
Query: What is the value of Pi to 10 decimal places?
D-score: 0.0000
Consensus: True
Answer: 3.1415926535
Query: What is the speed of light in vacuum in m/s?
D-score: 0.0000
Consensus: True
Answer: 299,792,458 m/s
Query: What is the average-case time complexity of QuickSort?
D-score: 0.0200
Consensus: True
Answer: O(n log n)
Query: What transport protocol does TCP use for reliable delivery?
D-score: 0.0100
Consensus: True
Answer: TCP uses acknowledgments and retransmission for reliable delivery
Query: Is Python statically or dynamically typed?
D-score: 0.0300
Consensus: True
Answer: Python is dynamically typed
All queries achieve consensus well below the 0.05 D-threshold. Mathematical and physical constants achieve perfect 0.0 scores because the answers are unambiguous. Slightly higher D-scores on descriptive questions reflect minor wording variation between models, not disagreement.
Example 3: Code Review
Use consensus to review code for bugs, edge cases, and security vulnerabilities. This example reviews correct implementations, buggy code, algorithm correctness, and security issues.
import asyncio
import os
from src.engine import ConsensusEngine, ConsensusConfig
from src.llm.openrouter_llm import OpenRouterLLM

BUGGY_CODE = """
def binary_search(arr, target):
    low, high = 0, len(arr)
    while low < high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid
        else:
            high = mid
    return -1
"""

SQL_CODE = """
def get_user(username):
    query = f"SELECT * FROM users WHERE name = '{username}'"
    return db.execute(query)
"""

async def main():
    api_key = os.environ["OPENROUTER_API_KEY"]
    models = [
        OpenRouterLLM(api_key=api_key, model="openai/gpt-5.4-mini"),
        OpenRouterLLM(api_key=api_key, model="anthropic/claude-haiku-4-5"),
        OpenRouterLLM(api_key=api_key, model="google/gemini-flash-1.5"),
    ]
    config = ConsensusConfig(max_iterations=7, D_threshold=0.15)
    engine = ConsensusEngine(models=models, config=config)

    # Review buggy binary search
    result = await engine.run(
        query=f"Review this binary search for bugs:\n{BUGGY_CODE}"
    )
    print(f"Binary Search Review:")
    print(f" D-score: {result.final_D:.4f}")
    print(f" Finding: {result.final_answer}")
    print()

    # Review SQL injection vulnerability
    result = await engine.run(
        query=f"Review this code for security vulnerabilities:\n{SQL_CODE}"
    )
    print(f"SQL Code Review:")
    print(f" D-score: {result.final_D:.4f}")
    print(f" Finding: {result.final_answer}")

asyncio.run(main())
Expected output
Binary Search Review:
D-score: 0.0800
Finding: Bug found: low = mid should be low = mid + 1 to avoid
infinite loop when target is greater than arr[mid]. The current
implementation will loop forever when low + 1 == high.
SQL Code Review:
D-score: 0.0400
Finding: Critical SQL injection vulnerability. User input is
directly interpolated into the query string via f-string.
Use parameterized queries instead:
db.execute("SELECT * FROM users WHERE name = ?", (username,))
The lower D-score on the SQL injection finding reflects stronger consensus -- SQL injection is a well-known vulnerability that all models identify with high confidence. The binary search bug has a slightly higher D-score because models may phrase the diagnosis differently.
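For reference, here is a corrected version of the binary search, applying the fix the finding describes: advance `low` past `mid` on the less-than branch so the search interval always shrinks. This is our sketch of the fix, not output from the engine:

```python
def binary_search(arr, target):
    """Half-open [low, high) binary search; returns index of target or -1."""
    low, high = 0, len(arr)
    while low < high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1  # the fix: exclude mid so the loop always makes progress
        else:
            high = mid
    return -1
```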
Example 4: Research Synthesis
Synthesize information from multiple AI perspectives on architectural decisions, best practices, and technology comparisons. Uses higher iteration limits to allow models to refine and converge on nuanced topics.
import asyncio
import os
from src.engine import ConsensusEngine, ConsensusConfig
from src.llm.openrouter_llm import OpenRouterLLM

RESEARCH_QUERIES = [
    "Microservices vs monolith: when should you migrate?",
    "What are current best practices for password hashing in 2025?",
    "WebSockets vs Server-Sent Events: trade-offs for real-time apps?",
    "What scaling strategies work best for read-heavy workloads?",
]

async def main():
    api_key = os.environ["OPENROUTER_API_KEY"]
    models = [
        OpenRouterLLM(api_key=api_key, model="openai/gpt-5.4-mini"),
        OpenRouterLLM(api_key=api_key, model="anthropic/claude-haiku-4-5"),
        OpenRouterLLM(api_key=api_key, model="google/gemini-flash-1.5"),
    ]
    config = ConsensusConfig(max_iterations=12, D_threshold=0.20)
    engine = ConsensusEngine(models=models, config=config)
    for query in RESEARCH_QUERIES:
        result = await engine.run(query=query)
        print(f"Query: {query}")
        print(f" D-score: {result.final_D:.4f}")
        print(f" Iterations: {result.iterations_used}")
        print(f" Consensus: {result.consensus_reached}")
        print(f" Summary: {result.final_answer[:150]}...")
        print()

asyncio.run(main())
Expected output
Query: Microservices vs monolith: when should you migrate?
D-score: 0.1800
Iterations: 8
Consensus: True
Summary: Migrate to microservices when your team exceeds ~15 engineers,
deployment frequency is blocked by monolith coupling, and you need
independent scaling...
Query: What are current best practices for password hashing in 2025?
D-score: 0.0600
Iterations: 3
Consensus: True
Summary: Use Argon2id as the primary recommendation (OWASP). bcrypt
remains acceptable. Key parameters: memory >= 19 MiB, iterations >= 2...
Query: WebSockets vs Server-Sent Events: trade-offs for real-time apps?
D-score: 0.1500
Iterations: 6
Consensus: True
Summary: WebSockets for bidirectional communication (chat, gaming).
SSE for server-to-client streaming (notifications, feeds). SSE is
simpler and works through HTTP proxies...
Query: What scaling strategies work best for read-heavy workloads?
D-score: 0.1200
Iterations: 5
Consensus: True
Summary: Read replicas as the primary strategy, combined with
application-level caching (Redis/Memcached) and CDN for static
assets. Consider CQRS for complex domains...
Research queries typically require more iterations (5-8) and produce moderate D-scores (0.10-0.20) because the topics involve nuanced trade-offs. Password hashing converges faster because established standards exist.
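Since engine.run is async, independent research queries could also be issued concurrently with asyncio.gather rather than in a sequential loop. Whether a single ConsensusEngine instance supports concurrent runs is not documented, so treat this as a pattern sketch; the stand-in coroutine below exists only so the snippet is runnable without an API key:

```python
import asyncio

async def gather_results(run, queries):
    """Run all queries concurrently; results come back in input order."""
    return await asyncio.gather(*(run(q) for q in queries))

# Stand-in for engine.run, for illustration only
async def fake_run(query):
    await asyncio.sleep(0)
    return f"answer to: {query}"

results = asyncio.run(gather_results(fake_run, ["a", "b"]))
```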
Example 5: Custom Axioms
Define domain-specific axioms to constrain consensus within your organization's requirements. This is the most powerful ACP pattern for enterprise use cases -- compliance validation, API design review, and policy enforcement.
import asyncio
import os
from src.engine import ConsensusEngine, ConsensusConfig
from src.llm.openrouter_llm import OpenRouterLLM

async def main():
    api_key = os.environ["OPENROUTER_API_KEY"]
    models = [
        OpenRouterLLM(api_key=api_key, model="openai/gpt-5.4-mini"),
        OpenRouterLLM(api_key=api_key, model="anthropic/claude-haiku-4-5"),
        OpenRouterLLM(api_key=api_key, model="google/gemini-flash-1.5"),
    ]
    config = ConsensusConfig(max_iterations=7, D_threshold=0.15)
    engine = ConsensusEngine(models=models, config=config)

    # --- API Design with Performance Constraints ---
    api_axioms = [
        "All API endpoints must follow RESTful conventions",
        "Response time must be under 200ms at P99",
        "All endpoints require JWT authentication",
        "Pagination is required for list endpoints",
    ]
    result = await engine.run(
        query="Evaluate this API design: GET /api/users returns "
              "all users without pagination or authentication",
        relevant_axioms=api_axioms,
    )
    print(f"API Design Review:")
    print(f" D-score: {result.final_D:.4f}")
    print(f" Finding: {result.final_answer}")
    print()

    # --- Security with Compliance Requirements ---
    security_axioms = [
        "All data at rest must be encrypted (AES-256)",
        "PII must not appear in logs",
        "API keys must be rotated every 90 days",
    ]
    result = await engine.run(
        query="Our application logs full request bodies including "
              "user email and phone number. Is this compliant?",
        relevant_axioms=security_axioms,
    )
    print(f"Security Compliance Review:")
    print(f" D-score: {result.final_D:.4f}")
    print(f" Finding: {result.final_answer}")

asyncio.run(main())
Expected output
API Design Review:
D-score: 0.0500
Finding: Violates 3 of 4 axioms: (1) Missing JWT authentication,
(2) No pagination for list endpoint, (3) Returning all users
will exceed 200ms P99 at scale. Only RESTful convention (GET
for retrieval) is satisfied.
Security Compliance Review:
D-score: 0.0300
Finding: Non-compliant. Logging email and phone number violates
"PII must not appear in logs" axiom. Remediation: implement
PII redaction in the logging pipeline, use structured logging
with field-level filtering.
Custom axiom queries achieve low D-scores because the axioms provide explicit, unambiguous criteria for evaluation. This makes ACP particularly effective for compliance and policy enforcement where rules are well-defined.
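How ConsensusEngine combines relevant_axioms with the query is internal to the engine, but the general pattern -- prepending axioms as explicit, numbered evaluation criteria -- can be sketched as plain prompt construction. The helper name and prompt format below are our assumptions, not the engine's actual template:

```python
def build_axiom_prompt(query: str, axioms: list[str]) -> str:
    """Prepend numbered axioms as evaluation criteria ahead of the query."""
    lines = ["Evaluate against these axioms:"]
    lines += [f"{i}. {axiom}" for i, axiom in enumerate(axioms, start=1)]
    lines += ["", query]  # blank line separates criteria from the question
    return "\n".join(lines)
```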
Running All Examples
# From project root
for example in examples/0*.py; do
  echo "Running $example..."
  python "$example"
  echo "---"
done
Understanding the Output
All examples produce the same core metrics. The table below explains each field.
| Metric | Meaning | Range |
|---|---|---|
| D-score | Divergence between models. Lower is better. | 0 (perfect consensus) to 1 (total disagreement) |
| H_total | Total harmony score. Higher is better. | 0 (discord) to 1 (perfect harmony) |
| consensus_reached | Whether the D-score fell below the configured threshold. | True / False |
| iterations_used | Number of convergence rounds performed. | 1 to max_iterations |
| musical_interval | Metaphorical harmony level based on the D-score. | unison, octave, fifth, fourth, third, tritone |
D-Score Interpretation Guide
The D-score is the primary consensus metric. Its meaning depends on context -- a D-score of 0.15 is excellent for a subjective technology decision but concerning for a mathematical fact.
| D-Score Range | Quality | Typical Use Cases |
|---|---|---|
| < 0.05 | Excellent | Verifiable facts, mathematical constants, well-known algorithms |
| < 0.15 | Good | Code review findings, established best practices, clear standards |
| < 0.30 | Moderate | Technology comparisons, architectural trade-offs, research synthesis |
| > 0.30 | Weak | Highly subjective topics, poorly defined questions, insufficient iterations |
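The table maps directly onto a small helper, which can be handy when gating an automated pipeline on consensus quality. The function and its band labels are ours, written from the table above:

```python
def classify_d_score(d: float) -> str:
    """Map a D-score to the quality band from the interpretation guide."""
    if d < 0.05:
        return "excellent"
    if d < 0.15:
        return "good"
    if d < 0.30:
        return "moderate"
    return "weak"
```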
Interpreting high D-scores
A D-score above 0.30 does not necessarily mean the result is wrong. It may indicate a genuinely controversial topic, an ambiguous question, or that more iterations are needed. Try increasing max_iterations or rephrasing the query with more specificity.
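One way to act on this advice is to re-run the same query with a growing iteration budget until the score clears the threshold. The sketch below takes any async run callable so it stays independent of the engine API; the callable signature and the dict-shaped result are our assumptions:

```python
import asyncio

async def run_with_escalation(run, query, budgets=(5, 8, 12), threshold=0.30):
    """Retry `run(query, max_iterations=...)` with growing budgets until the
    reported D-score falls below `threshold`; returns the last result."""
    result = None
    for budget in budgets:
        result = await run(query, max_iterations=budget)
        if result["final_D"] < threshold:
            break  # consensus is good enough; stop escalating
    return result
```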
Cost Considerations
Pricing note
Costs are approximate and based on OpenRouter pricing as of January 2026. Always check current rates at openrouter.ai/models.
| Example | Typical Cost | Duration |
|---|---|---|
| Simple Query | ~$0.02 | 5-10s |
| Fact Checking (5 queries) | ~$0.15 | 30-60s |
| Code Review (2 queries) | ~$0.20 | 60-90s |
| Research Synthesis (4 queries) | ~$0.30 | 90-120s |
| Custom Axioms (2 queries) | ~$0.20 | 60-90s |
| Total for all examples | ~$0.87 | ~5 min |
Common Issues
"OPENROUTER_API_KEY environment variable not set"
export OPENROUTER_API_KEY="sk-or-v1-..."
# Windows: set OPENROUTER_API_KEY=sk-or-v1-...
"ModuleNotFoundError: No module named 'src'"
You are running from the wrong directory. Always run examples from the project root:
cd /path/to/ACP-PROJECT
python examples/01_simple_query.py
"Rate limit exceeded"
OpenRouter has per-minute rate limits. Wait 60 seconds between example runs or upgrade your plan at openrouter.ai/settings/limits.
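A simple programmatic guard against per-minute limits is exponential backoff between attempts. Which exception the client raises on a rate limit is not specified here, so this sketch retries on any exception for illustration:

```python
import time

def with_backoff(fn, attempts=4, base_delay=1.0):
    """Call `fn`, retrying with exponential backoff (1s, 2s, 4s, ...)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))
```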
Next Steps
- Modify queries -- try your own questions in any example
- Adjust parameters -- change max_iterations, D_threshold, and structure
- Add models -- try different LLMs from OpenRouter
- Create custom examples -- use these as templates for your own use cases
Further reading
See the Use Cases guide for 10 detailed application scenarios, or the Worker API reference for the complete endpoint specification.