Frontier models
Multimodal reasoning models cross the agentic-execution threshold
The newest GPT-class and Gemini-class systems now plan, call tools, and verify their own outputs in
multi-hour loops. Benchmarks shift from single-turn QA to end-to-end task completion, with
METR-style evaluations becoming the default measure of capability.
Industry · April 2026
Regulation
EU AI Act enters its general-purpose enforcement phase
August 2026 deadlines push frontier-model providers to publish system cards, training-data
summaries, and post-market monitoring plans. Smaller open-source labs lobby for proportionate
obligations.
Policy · Brussels
Open source
Llama 4 and DeepSeek-V4 close the gap on closed frontier models
Open-weight releases now ship with native tool-use, long-context retrieval, and reasoning traces.
Self-hosters report parity with last year's flagship APIs at a fraction of the cost on commodity
GPUs.
Models · Open weights
Agents
The agent-framework consolidation has begun
LangGraph, CrewAI, AutoGen, and Microsoft's Agent Framework converge on similar primitives:
typed contracts, durable state, retry policies, and supervisor patterns — exactly the shape XCOM
has run on since launch.
Tooling · Frameworks
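The shared primitives named above can be sketched in a few lines. This is a minimal, framework-agnostic illustration, not the API of LangGraph, CrewAI, AutoGen, or Microsoft's Agent Framework; every name in it is invented for the sketch.

```python
from dataclasses import dataclass
import time

@dataclass
class SearchRequest:          # typed input contract for a tool call
    query: str
    max_results: int = 5

@dataclass
class SearchResult:           # typed output contract
    urls: list

def with_retries(fn, attempts=3, backoff=0.5):
    """Retry policy: re-run fn on failure with exponential backoff."""
    def wrapper(*args, **kwargs):
        for i in range(attempts):
            try:
                return fn(*args, **kwargs)
            except Exception:
                if i == attempts - 1:
                    raise
                time.sleep(backoff * 2 ** i)
    return wrapper

@with_retries
def search_tool(req: SearchRequest) -> SearchResult:
    # Stub "tool"; a real agent would call an external API here.
    return SearchResult(urls=[f"https://example.com/{req.query}"])

print(search_tool(SearchRequest(query="agent-frameworks")).urls)
```

Durable state and supervisor patterns layer on the same idea: the typed contract makes tool calls checkpointable, and the retry wrapper is what a supervisor applies around each worker step.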
Infrastructure
Inference becomes the dominant AI compute cost
As reasoning models burn tokens by the millions per task, hyperscalers reroute GPU capacity from
training to inference. Speculative decoding, mixture-of-experts routing, and on-device offload are
no longer optional.
Compute · Hyperscale
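Of the techniques listed, speculative decoding is the easiest to sketch: a cheap draft model proposes a run of tokens, and the expensive target model verifies them in one pass, keeping the longest agreeing prefix. Both models below are deterministic stand-ins invented for the sketch, not real models.

```python
def draft_model(prefix):
    # Cheap proposer: a fixed stub whose third token deliberately
    # deviates, so the verifier has something to reject.
    return [4, 5, 9, 7]

def target_model(prefix, proposed):
    # Expensive verifier: in a real system this is one batched forward
    # pass; here the "true" next token is just (previous + 1) mod 10.
    accepted = []
    cur = prefix
    for tok in proposed:
        if tok != (cur + 1) % 10:
            break                # first disagreement: stop accepting
        accepted.append(tok)
        cur = tok
    return accepted

proposal = target = draft_model(3)
accepted = target_model(3, proposal)
print(proposal, accepted)        # the draft's first two tokens survive
```

The win is that several draft tokens are verified for the price of one target-model step; only the rejected suffix is regenerated.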
Security
Prompt injection moves from research curiosity to top-tier threat
Real-world incidents involving exfiltration via tool-calling agents push OWASP to publish a
dedicated LLM Top-10 update. Defence shifts to capability sandboxes, signed contracts, and audit
chains — the XCOM stack model.
OWASP · Threat intel
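A capability sandbox of the kind described can be sketched as an allowlist gate around tool calls, with every invocation appended to an audit log. This is a hedged toy, not the XCOM stack or any OWASP-specified control; all names and the two stub tools are invented for illustration.

```python
import datetime

AUDIT_LOG = []                   # append-only audit chain (toy version)

def audited(tool_name, allowlist):
    """Gate a tool behind a capability grant and log every call."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            if tool_name not in allowlist:
                raise PermissionError(f"{tool_name} not granted")
            AUDIT_LOG.append({
                "tool": tool_name,
                "args": repr(args),
                "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            })
            return fn(*args, **kwargs)
        return wrapper
    return decorator

GRANTED = {"read_file"}          # capabilities granted for this session

@audited("read_file", GRANTED)
def read_file(path):
    return f"<contents of {path}>"

@audited("send_email", GRANTED)
def send_email(to, body):        # exfiltration path: never granted
    return "sent"

print(read_file("notes.txt"))    # allowed and logged
try:
    send_email("attacker@example.com", "secrets")
except PermissionError as e:
    print("blocked:", e)         # injection cannot widen the grant set
```

The key property is that a prompt-injected instruction cannot expand `GRANTED`; the grant lives outside the model's reach, and the audit log records everything the agent did attempt.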