Service

Agents

Speed and scale decide outcomes. Autonomy is the multiplier.

Cybersecurity AI operates through specialized agents that replicate core security roles — from defense and incident response to penetration testing and network analysis. The key challenge becomes measuring their effectiveness: how well they perform specific security jobs, how they improve over time, and how different agents can be objectively compared on the same labor-relevant tasks.

“The fastest exploit won't be a zero-day. It'll be 1,000 agents iterating.”

Generations of agents

Every agent below is grounded in peer-reviewed research. See the 25+ papers behind the lab — from CAI to CAIBench and G-CTR.

Defender agent
Bug Bounty agent
Forensics agent
CLI agent
Social Eng. agent
Network agent
Red Team agent
Replay Attack agent
Reporting agent
Retester agent
SDR agent
Robot Defender agent
Use Case agent
APT agent
Custom (your agent)

Agent heuristics I

The architecture of cybersecurity agents has evolved across four generations — from AI-guided humans (2023) to game-theoretic AI agents (2026) that plan, attack and reason at machine speed.

2023 · PentestGPT · AI-Guided Humans
~10s Plan (LLM) · Human · Act (tools) · Human

2025 · Cybersecurity AI (CAI) · AI Agents (~70s)
~10s Plan (LLM) · ~60s Act (tools) · Scan & Update

2026 · G-CTR Analysis · Game-Theoretic Analysis
~20s Attack Graph Gen. · <5ms Nash Equilibrium · G-CTR Results

2026 · G-CTR Guidance · Game-Theoretic AI Agents (~70s)
<10ms Algorithmic digest · ~28.3s LLM digest · Strategic Interpret.

Sources: Deng, G., Liu, Y., Mayoral-Vilches, V., et al. (2024). PentestGPT. USENIX Security · Mayoral-Vilches, V., et al. (2025). Cybersecurity AI (CAI). arXiv:2504.06017 · Mayoral-Vilches, V., et al. (2026). A Game-Theoretic AI for Guiding Attack and Defense. arXiv:2601.05887 · Mayoral-Vilches, V., et al. (2026). Towards Cybersecurity Superintelligence. arXiv:2601.14614.
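To make the "Nash Equilibrium" step concrete, here is a minimal textbook sketch of solving a 2x2 zero-sum attacker/defender game in closed form. It is not the G-CTR algorithm from the cited papers; the function name `solve_2x2_zero_sum` and the payoff numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def solve_2x2_zero_sum(payoff: List[List[float]]) -> Tuple[float, float, float]:
    """Return (p, q, value) for a 2x2 zero-sum game.

    payoff[i][j] is the attacker's gain when the attacker plays row i and
    the defender plays column j. p is the probability of row 0 and q is the
    probability of column 0 at equilibrium.
    """
    (a, b), (c, d) = payoff

    # Pure-strategy saddle point: the maximin equals the minimax.
    row_mins = [min(a, b), min(c, d)]
    col_maxs = [max(a, c), max(b, d)]
    maximin, minimax = max(row_mins), min(col_maxs)
    if maximin == minimax:
        p = 1.0 if row_mins[0] == maximin else 0.0
        q = 1.0 if col_maxs[0] == minimax else 0.0
        return p, q, maximin

    # No saddle point: each side mixes so the opponent is indifferent.
    denom = a - b - c + d
    p = (d - c) / denom
    q = (d - b) / denom
    value = (a * d - b * c) / denom
    return p, q, value


# Hypothetical attack-success probabilities.
# Rows: attacker targets web or VPN. Columns: defender hardens web or VPN.
PAYOFF = [[0.2, 0.8],
          [0.6, 0.1]]

p_web, q_web, game_value = solve_2x2_zero_sum(PAYOFF)
print(f"attacker picks web with p={p_web:.3f}")
print(f"defender hardens web with q={q_web:.3f}")
print(f"expected attack success={game_value:.3f}")
```

With these made-up numbers the attacker targets the web path about 45% of the time and the defender hardens it about 64% of the time. A real attack graph would have many more strategies and would typically be solved with linear programming rather than a closed form.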

Agent heuristics II

Effectiveness is measured on Cybench: 33 CTF challenges, pass@3, at most 245 minutes per challenge. Combining heterogeneous agents via Blackboard cross-write beats every single scaffold (19/33 versus a best single-scaffold score of 15/33). Methodology and full numbers are in CSI: What's the best harness? (arXiv:2605.28334).
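One way to see why cross-writing can solve more than independent runs is a toy model in which a challenge needs several steps and no single agent can do them all. The sketch below is only an illustration under that assumption; the challenge names, skills, and the functions `union_of_runs` and `blackboard_run` are hypothetical and are not taken from the CSI study.

```python
from typing import Dict, Set

# Hypothetical toy data: steps each challenge needs, steps each agent can do.
CHALLENGES: Dict[str, Set[str]] = {
    "web-1": {"recon", "sqli"},
    "pwn-1": {"reverse", "rop"},
    "mix-1": {"recon", "reverse"},
}
AGENT_SKILLS: Dict[str, Set[str]] = {
    "agent_a": {"recon", "sqli"},
    "agent_b": {"reverse", "rop"},
}


def solved_alone(skills: Set[str]) -> Set[str]:
    """Challenges one agent finishes with no help from other agents."""
    return {name for name, steps in CHALLENGES.items() if steps <= skills}


def union_of_runs(agents: Dict[str, Set[str]]) -> Set[str]:
    """Agents run independently; results are merged afterwards."""
    solved: Set[str] = set()
    for skills in agents.values():
        solved |= solved_alone(skills)
    return solved


def blackboard_run(agents: Dict[str, Set[str]]) -> Set[str]:
    """Agents write completed steps to a shared board per challenge."""
    solved: Set[str] = set()
    for name, steps in CHALLENGES.items():
        board: Set[str] = set()  # shared findings for this challenge
        for skills in agents.values():
            board |= steps & skills  # each agent posts the steps it can do
        if board >= steps:
            solved.add(name)
    return solved


union_solved = union_of_runs(AGENT_SKILLS)
blackboard_solved = blackboard_run(AGENT_SKILLS)
print(sorted(union_solved))
print(sorted(blackboard_solved))
```

In this toy setup the union covers `web-1` and `pwn-1`, while the shared board also closes `mix-1` because each agent contributes the step the other lacks.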

CSI::Claude: 15/33 · 26.8h · $5,122
CSI::Codex: 15/33 · 18.4h · $1,713
CSI::Mistral: 10/33 · 21.9h · $970
CSI::GCAI: 10/33 · 30.4h · $1,279
CSI::CAI: 7/33 · 15.9h · $727
Union: 17/33 · ∪ all scaffolds
Parallel race: 17/33 · no-comm
Blackboard: 19/33 · cross-write

Cybench — pass@3, 300 agentic interactions max, 245 minutes max, $40 API expenses max.
References: CSI harness study (arXiv:2605.28334) · CAIBench (arXiv:2510.24317) · Agentic A&D CTF evaluation (arXiv:2510.17521) · World's top CTF agent (arXiv:2512.02654).
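The single-scaffold figures above can also be compared per solved challenge. The short calculation below is a sketch that assumes the hours and dollar amounts are totals for each scaffold's full 33-challenge run, which the page does not state explicitly; the variable names are illustrative.

```python
# (solved challenges, total hours, total cost in USD) as reported above.
# Assumption: hours and cost are totals over the whole Cybench run.
RESULTS = {
    "CSI::Claude": (15, 26.8, 5122),
    "CSI::Codex": (15, 18.4, 1713),
    "CSI::Mistral": (10, 21.9, 970),
    "CSI::GCAI": (10, 30.4, 1279),
    "CSI::CAI": (7, 15.9, 727),
}

cost_per_solve = {
    name: cost / solved for name, (solved, _hours, cost) in RESULTS.items()
}
hours_per_solve = {
    name: hours / solved for name, (solved, hours, _cost) in RESULTS.items()
}

for name in RESULTS:
    print(
        f"{name}: ${cost_per_solve[name]:.2f} and "
        f"{hours_per_solve[name]:.2f}h per solved challenge"
    )

cheapest = min(cost_per_solve, key=cost_per_solve.get)
fastest = min(hours_per_solve, key=hours_per_solve.get)
print("lowest cost per solve:", cheapest)
print("lowest time per solve:", fastest)
```

Under that assumption, CSI::Mistral has the lowest cost per solved challenge (about $97) and CSI::Codex the lowest time per solved challenge, even though CSI::Claude ties for the most challenges solved.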

One thousand agents.
Iterating, in parallel, on your behalf.

Agents are deployed with select partners to validate and execute security continuously. Talk to us about your threat model.