How we build.
XCOM.DEV ships infrastructure that runs unattended. Methodology is therefore not optional.
1. Agent design
- Single responsibility — every agent owns one role with bounded capabilities.
- Declared capabilities — the registry is the source of truth.
- Composable skills — reasoning, retrieval, exec, web — added without prompt rewrites.
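The three principles above can be sketched as a small capability registry. This is a minimal illustration only: `AgentSpec`, `Registry`, `add_skill`, `authorize`, and the `summarizer` agent are hypothetical names, not the actual XCOM.DEV registry API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentSpec:
    """Declared role and skills for one agent (hypothetical schema)."""
    name: str
    role: str                      # single responsibility: exactly one role
    skills: frozenset = field(default_factory=frozenset)


class Registry:
    """Source of truth for what each agent is allowed to do."""

    def __init__(self):
        self._agents = {}

    def register(self, spec):
        if spec.name in self._agents:
            raise ValueError(f"agent {spec.name!r} is already registered")
        self._agents[spec.name] = spec

    def add_skill(self, name, skill):
        # Composing a skill replaces the declaration; no prompt is touched.
        old = self._agents[name]
        self._agents[name] = AgentSpec(old.name, old.role, old.skills | {skill})

    def authorize(self, name, skill):
        # Unknown agents and undeclared skills are both denied.
        spec = self._agents.get(name)
        return spec is not None and skill in spec.skills


registry = Registry()
registry.register(
    AgentSpec("summarizer", "summarize tickets", frozenset({"reasoning", "retrieval"}))
)
registry.add_skill("summarizer", "web")
```

In this sketch a supervisor would call `registry.authorize(agent, skill)` before every skill invocation, so anything not declared in the registry is denied by default.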
2. Contracts as tests
Each contract between two agents has a typed input, a typed output, and a property-based test suite. A pipeline cannot run unless the test suite of every contract it depends on passes.
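As a rough sketch of what such a contract might look like, the example below defines a typed input, a typed output, and a hand-rolled property check that gates pipeline start. All names (`RetrievalRequest`, `RetrievalResult`, `retrieve`, `check_contract`, `run_pipeline`) and the two properties are assumptions made for illustration. The random generation uses only the standard library; a dedicated property-based testing library such as Hypothesis would normally handle input generation and shrinking.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalRequest:    # typed input (hypothetical)
    query: str
    top_k: int


@dataclass(frozen=True)
class RetrievalResult:     # typed output (hypothetical)
    doc_ids: tuple


def retrieve(req: RetrievalRequest) -> RetrievalResult:
    """Stand-in for the downstream agent: returns up to top_k fake ids."""
    corpus = [f"doc-{i}" for i in range(20)]
    return RetrievalResult(tuple(corpus[: max(req.top_k, 0)]))


def check_contract(agent, trials=200, seed=0):
    """Return True if the output properties hold for every random input."""
    rng = random.Random(seed)
    for _ in range(trials):
        req = RetrievalRequest(
            query="".join(rng.choice("abc ") for _ in range(rng.randint(0, 12))),
            top_k=rng.randint(0, 50),
        )
        out = agent(req)
        if not isinstance(out, RetrievalResult):
            return False
        if len(out.doc_ids) > req.top_k:               # never more than requested
            return False
        if len(set(out.doc_ids)) != len(out.doc_ids):  # no duplicate ids
            return False
    return True


def run_pipeline(contracts):
    """Refuse to start unless every contract's checks pass."""
    failed = [name for name, agent in contracts.items() if not check_contract(agent)]
    if failed:
        raise RuntimeError(f"contracts failed: {failed}")
    return "started"
```

Calling `run_pipeline({"retrieval": retrieve})` starts only because the stand-in agent satisfies both properties. An agent that returned more ids than requested would raise `RuntimeError` before any work ran.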
3. Supervised rollout
- Shadow mode — new agent runs alongside production, output compared offline.
- Canary — 5% of traffic, supervisor enforces strict timeouts and budgets.
- Promotion — circuit-breaker thresholds locked in before full rollout.
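The canary split and the circuit breaker can be illustrated with a short sketch. It assumes hash-based routing on a request id and a failure-rate breaker over a fixed window. `route`, `CircuitBreaker`, and the threshold values are hypothetical, and the supervisor's timeout and budget enforcement is left out for brevity.

```python
import hashlib


def route(request_id: str, canary_percent: int = 5) -> str:
    """Deterministically send about canary_percent% of requests to the canary."""
    digest = hashlib.sha256(request_id.encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100   # stable bucket 0..99
    return "canary" if bucket < canary_percent else "production"


class CircuitBreaker:
    """Opens once the failure rate over a full window exceeds a fixed threshold."""

    def __init__(self, max_failure_rate: float, window: int):
        # Thresholds are set once at construction, mirroring "locked in
        # before full rollout".
        self.max_failure_rate = max_failure_rate
        self.window = window
        self.results = []
        self.open = False

    def record(self, ok: bool) -> None:
        # Keep only the most recent `window` outcomes.
        self.results = (self.results + [ok])[-self.window:]
        if len(self.results) == self.window:
            failure_rate = self.results.count(False) / self.window
            if failure_rate > self.max_failure_rate:
                self.open = True   # stays open until an operator resets it
```

Hashing the request id, rather than drawing a random number per call, keeps each request pinned to the same variant across retries, which makes canary and production outputs easier to compare.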
4. Continuous evaluation
METR-style task suites run nightly. A regression in any suite blocks the next deployment. The audit chain provides a complete forensic record of every decision.
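A deployment gate of this kind could look roughly like the sketch below. It assumes each suite yields a single score between 0 and 1 and that a drop beyond a small tolerance counts as a regression. `find_regressions`, `may_deploy`, and the default `tolerance` are illustrative choices, not the production values.

```python
def find_regressions(baseline, nightly, tolerance=0.01):
    """Return suites whose nightly score fell more than `tolerance` below baseline.

    A suite missing from the nightly run is treated as a regression.
    """
    regressions = {}
    for suite, base_score in baseline.items():
        score = nightly.get(suite)
        if score is None or score < base_score - tolerance:
            regressions[suite] = (base_score, score)
    return regressions


def may_deploy(baseline, nightly, tolerance=0.01):
    """Deployment is allowed only when no suite has regressed."""
    return not find_regressions(baseline, nightly, tolerance)
```

Treating a missing suite as a failure is a deliberate choice in this sketch: a nightly run that silently skips a suite should not unblock a deployment.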
5. Risk methodology
We model threats against the OWASP Top 10 for LLM Applications and run tabletop exercises monthly. See also Security & Compliance (NIS2).