Autonomous Bug Bounty Agent (with Scope-Enforcing Proxy + PoC Validator)
Status: Private beta / Early access
Focus: Authorized, in-scope security testing (VDP / Bug Bounty; black-box by default, grey-box when credentials are supplied)
We’re three security researchers based in Tokyo building an autonomous agent framework that maps an application, plans targeted security hypotheses, and produces a human-reviewable report, all under strict safety constraints that keep it from wandering out of scope.
There’s no public repo yet; this page shares architecture and learnings for feedback.
TL;DR
- Multi-agent workflow: recon → hypothesis planning → class-specific testing → validation → report drafting.
- All traffic passes through a scope-enforcing proxy (allowlist + rate/concurrency caps + logging).
- Real-world validation (as of Feb 8, 2026): running on ~5 targets/week since late 2025.
  - U.S. Dept. of Defense (DoD): 3 vulnerabilities triaged.
  - HackerOne ranking: reached #86 globally on the VDP (90 Days) leaderboard.
  - Bug Bounty Programs: 2 duplicates, 1 under review.
- Benchmarks: solved 84% of PortSwigger Web Security Academy labs autonomously.


What this is / isn’t
✅ This is
- An autonomous testing engine for authorized scopes with human approval before submission.
- A precision-focused system that validates its own findings, leaving only final approval and report submission to a human.
❌ This is not
- A fully autonomous submit-to-bounty bot.
- A general internet crawler or exploitation toolkit.
- A replacement for structured, coverage-driven pentest methodology (yet).
The Architecture
The workflow mimics human red-team methodology while maintaining hard safety controls. Final submission is always decided by a human.
- Input: Target URL and optional credentials (for grey-box testing)
- Output: Drafted report for human review
Architecture (overview)
1) Initial Recon Agent: enumerates reachable endpoints in-scope, infers technology patterns, and builds an attack-surface map.
2) Coordinator: selects hypotheses, delegates to specialized agents, and manages budgets/rate limits/retries/stop conditions.
3) Specialized testing agents: focused workers (IDOR/SQLi/XSS) reduce hallucinations and apply class-specific evidence heuristics.
4) Validator + Report Drafting: replays key requests, runs negative checks, collects artifacts, and emits draft reports for human review.
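The hand-off between the four stages above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names Hypothesis, Finding, Coordinator, and max_tasks are hypothetical, and the agents and validator are plain callables standing in for the real components.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Hypothesis:
    vuln_class: str   # e.g. "idor", "sqli", "xss"
    endpoint: str     # an in-scope endpoint taken from the recon map


@dataclass
class Finding:
    hypothesis: Hypothesis
    evidence: str
    validated: bool = False


# A specialized agent takes one hypothesis and returns a finding or None.
Agent = Callable[[Hypothesis], Optional[Finding]]


@dataclass
class Coordinator:
    agents: Dict[str, Agent]
    validator: Callable[[Finding], bool]
    max_tasks: int = 10  # hypothetical per-run budget (assumption)
    drafts: List[Finding] = field(default_factory=list)

    def run(self, hypotheses: List[Hypothesis]) -> List[Finding]:
        for considered, hyp in enumerate(hypotheses):
            if considered >= self.max_tasks:
                break  # stop condition: budget of considered hypotheses spent
            agent = self.agents.get(hyp.vuln_class)
            if agent is None:
                continue  # no specialist for this class, so skip it
            finding = agent(hyp)
            if finding is None:
                continue
            finding.validated = self.validator(finding)
            if finding.validated:
                # Only validated findings become drafts for human review.
                self.drafts.append(finding)
        return self.drafts
```

In this sketch, nothing reaches the draft list without passing the validator, and the coordinator never submits anything itself; that mirrors the human-approval rule described above.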
Execution environment
- Python runtime (parsing, diffing, state handling)
- Headless browser (DOM rendering, JS-driven flows)
- Kali Linux shell (recon utilities, HTTP tooling, parsers)
- All traffic routed through the scope-enforcing proxy
Safety model and guardrails
Safety is a hard constraint. This system is intended only for authorized testing.
Scope-Enforcing Proxy
- Allowlist controls: FQDN/method constraints and optional headers
- Throttling: max RPS and concurrency caps
- Auditing: full allow/deny logging and reproducible traces
- Default-deny: ambiguous requests are blocked
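To make the four controls above concrete, here is a minimal sketch of the allow/deny decision such a proxy might make per request. It is an illustration under assumptions, not the actual proxy: ScopePolicy, max_rps, and the token-bucket throttle are hypothetical choices, header constraints and concurrency caps are omitted, and the example host is a placeholder.

```python
import logging
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet
from urllib.parse import urlsplit

log = logging.getLogger("scope_proxy")


@dataclass
class ScopePolicy:
    # Exact FQDN -> allowed HTTP methods. Anything not listed is denied.
    allowed: Dict[str, FrozenSet[str]]
    max_rps: float = 2.0                         # hypothetical cap (assumption)
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    _tokens: float = field(default=0.0, init=False)
    _last: float = field(default=0.0, init=False)

    def __post_init__(self) -> None:
        self._capacity = max(1.0, self.max_rps)  # bucket holds at least one token
        self._tokens = self._capacity
        self._last = self.clock()

    def _take_token(self) -> bool:
        # Token bucket: refill in proportion to elapsed time, spend one per request.
        now = self.clock()
        refill = (now - self._last) * self.max_rps
        self._tokens = min(self._capacity, self._tokens + refill)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False

    def decide(self, method: str, url: str) -> bool:
        try:
            parts = urlsplit(url)
            host = (parts.hostname or "").lower()
        except ValueError:
            parts, host = None, ""               # unparsable URL: treat as ambiguous
        methods = self.allowed.get(host)
        if parts is None or parts.scheme not in ("http", "https") or methods is None:
            verdict, reason = False, "host not in allowlist"
        elif method.upper() not in methods:
            verdict, reason = False, "method not allowed"
        elif not self._take_token():
            verdict, reason = False, "rate limit"
        else:
            verdict, reason = True, "in scope"
        # Every decision is logged, allowed or denied.
        log.info("%s %s %s (%s)", "ALLOW" if verdict else "DENY", method, url, reason)
        return verdict
```

The host is matched exactly against the parsed hostname rather than by substring, so a lookalike such as api.example.com.evil.test does not match an allowlisted api.example.com. Any request that fails to parse falls through to the deny branch.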
Safe PoC policy
- Prioritizes read-only verification patterns
- Avoids destructive payloads and persistence attempts
- Stops on instability or side-effect risk signals
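One way to read the policy above is as a small gate placed in front of every verification request. The sketch below is an assumption-laden illustration: PocGuard and max_server_errors are hypothetical names, "read-only" is approximated by the HTTP safe methods, and "instability" is approximated by consecutive 5xx responses. Real applications can violate safe-method semantics, so a production gate would need more signals than this.

```python
from dataclasses import dataclass, field

# HTTP methods defined as "safe" (read-only by intent) in HTTP semantics.
READ_ONLY_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


@dataclass
class PocGuard:
    max_server_errors: int = 3  # hypothetical threshold (assumption)
    halted: bool = field(default=False, init=False)
    _errors: int = field(default=0, init=False)

    def may_send(self, method: str) -> bool:
        # Refuse everything once halted; otherwise allow only safe methods.
        return not self.halted and method.upper() in READ_ONLY_METHODS

    def observe(self, status: int) -> None:
        # Count consecutive server errors as a crude instability signal.
        if status >= 500:
            self._errors += 1
            if self._errors >= self.max_server_errors:
                self.halted = True  # latches: stays halted for the rest of the run
        else:
            self._errors = 0
```

The halt is deliberately latching in this sketch: once the target looks unstable, resuming is left to a human decision.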
We do not publish exploit payloads or step-by-step compromise guidance.
Experimental Results (As of Feb 8, 2026)
- Running against ~5 targets/week since late 2025.
- VDP success: #86 globally on HackerOne VDP (90 Days), with 3 DoD vulnerabilities triaged.
- BBP challenges: 2 submissions closed as duplicates, 1 report under review.
- Key learning: there is an impact gap between technical correctness and business criticality; a technically valid finding is not necessarily one a program rates as high-impact.
Performance (one representative run)
- Wall time: ~2 hours
- Model/API cost: low single-digit USD (varies by conditions)
- Human time: review, verification, and final editing
Optimization priorities: high precision, strong evidence trails, strict scope adherence.
Limitations / current challenges
- Performance still degrades on SPA-heavy targets, which demand deeper browser-state modeling.
- Context growth can cause inefficient behavior or rare loops; mitigated via budgets, stop conditions, and summarization.
- Coverage and reproducibility vary by exploration path, timing, and defenses.
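The budget and stop-condition mitigations mentioned above could look something like the following. This is a minimal sketch under assumptions: RunGuard, max_steps, and max_repeats are hypothetical names and values, actions are represented as plain strings, and summarization is not modeled.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunGuard:
    max_steps: int = 200   # hypothetical step budget (assumption)
    max_repeats: int = 3   # hypothetical repeat limit per identical action
    steps: int = field(default=0, init=False)
    _seen: Counter = field(default_factory=Counter, init=False)

    def record(self, action: str) -> Optional[str]:
        """Record one agent action; return a stop reason, or None to continue."""
        self.steps += 1
        self._seen[action] += 1
        if self.steps > self.max_steps:
            return "step budget exhausted"
        if self._seen[action] > self.max_repeats:
            return "repeated action: possible loop"
        return None
```

A caller would invoke record() before executing each planned action and end the run on the first non-None result. Counting identical action strings is a crude loop signal; near-duplicate actions would slip past it.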
Ethics
- Authorized testing only within explicit VDP / bounty scopes.
- Human-in-the-loop; no automatic submissions.
- Scope enforcement via proxy and default-deny rules.
- No harmful payload sharing.
Contact / Disclosure
Open to collaboration and feedback from teams building similar systems.
Email: info@layer8.jp
