AI Testing Agents That Find Bugs for AI Coding Agents
Generate, execute, and report on test cases using real browser automation and multi-provider AI analysis. Built as an MCP plugin for Claude Code.
Everything you need for AI-driven web testing, built into your Claude workflow.
AI analyzes page screenshots and DOM to generate comprehensive test suites with steps, validation conditions, and priority scoring — automatically.
Tests run in actual Chromium via Puppeteer. Natural language steps are converted to browser actions — clicks, typing, scrolling — with retry logic.
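The source does not show how the retry logic is implemented. As a rough sketch only, retrying a flaky browser action with exponential backoff could look like the following. The names `withRetry`, `backoffDelay`, and `Sleep` are hypothetical and not taken from the plugin, and the sleep function is injected so the sketch does not depend on any particular timer.

```typescript
// Hypothetical helper names; the plugin's real retry code is not shown in the source.
type Sleep = (ms: number) => Promise<void>;

// Exponential backoff: retry 0 waits baseMs, retry 1 waits 2 * baseMs, and so on.
function backoffDelay(retryIndex: number, baseMs: number): number {
  return baseMs * Math.pow(2, retryIndex);
}

// Runs `action` up to `maxAttempts` times and rethrows the last error if all attempts fail.
async function withRetry<T>(
  action: () => Promise<T>,
  maxAttempts: number,
  baseMs: number,
  sleep: Sleep,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await action();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final failure.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelay(attempt, baseMs));
      }
    }
  }
  throw lastError;
}
```

With Puppeteer, the action passed in might be something like `() => page.click(selector)`, and `sleep` a small wrapper around `setTimeout`.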
7 specialized AI testers analyze your page in parallel: general, UI/UX, security, privacy, accessibility, content, and mobile.
Full accessibility audit against WCAG Level A, AA, or AAA. Detects missing alt text, contrast issues, ARIA problems, and keyboard navigation gaps.
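To make the contrast check concrete, here is a minimal sketch of the WCAG 2.x contrast ratio calculation. The formula and thresholds come from the WCAG definition of relative luminance and Success Criteria 1.4.3 (AA) and 1.4.6 (AAA). The function names are illustrative and say nothing about how the plugin computes contrast internally. Level A has no contrast minimum, so only AA and AAA appear.

```typescript
// Converts an sRGB channel (0-255) to its linear value, per the WCAG definition.
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" color.
function relativeLuminance(hex: string): number {
  const r = parseInt(hex.slice(1, 3), 16);
  const g = parseInt(hex.slice(3, 5), 16);
  const b = parseInt(hex.slice(5, 7), 16);
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

type WcagContrastLevel = "AA" | "AAA";

// AA requires 4.5:1 (3:1 for large text); AAA requires 7:1 (4.5:1 for large text).
function meetsContrast(ratio: number, level: WcagContrastLevel, largeText = false): boolean {
  const required = level === "AAA" ? (largeText ? 4.5 : 7) : largeText ? 3 : 4.5;
  return ratio >= required;
}
```

For example, gray text `#767676` on a white background has a ratio of roughly 4.54, which passes AA for normal text but not AAA.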
After code changes, the plugin generates targeted tests for your modifications — including regression tests and edge cases — then runs them automatically.
Every test run produces a detailed HTML report with AI reasoning, screenshots at each step, timing data, and pass/fail results, presented in a dark-themed timeline UI.
BFS-based crawler discovers pages across your site, generates tests for each, and deduplicates — giving you comprehensive coverage with one command.
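The breadth-first discovery with deduplication can be sketched as follows. This is an illustration under assumptions, not the plugin's crawler. It walks an in-memory link graph instead of fetching pages, and the names `LinkGraph`, `normalize`, and `crawlOrder` are hypothetical. Deduplication here means treating URLs that differ only by a fragment or trailing slash as the same page.

```typescript
// Hypothetical stand-in for real page fetching: page URL -> links found on that page.
// Keys are expected to be in normalized form.
type LinkGraph = Map<string, string[]>;

// Drops the fragment and a trailing slash so equivalent URLs compare equal.
function normalize(url: string): string {
  const hash = url.indexOf("#");
  const noFragment = hash === -1 ? url : url.slice(0, hash);
  return noFragment.replace(/\/$/, "");
}

// Returns pages in breadth-first order, visiting each page once, up to maxPages.
function crawlOrder(graph: LinkGraph, start: string, maxPages: number): string[] {
  const first = normalize(start);
  const seen = new Set<string>([first]);
  const queue: string[] = [first];
  const visited: string[] = [];
  while (visited.length < maxPages) {
    const current = queue.shift();
    if (current === undefined) break; // nothing left to discover
    visited.push(current);
    for (const link of graph.get(current) ?? []) {
      const target = normalize(link);
      if (!seen.has(target)) {
        seen.add(target);
        queue.push(target);
      }
    }
  }
  return visited;
}
```

Marking a URL as seen when it is queued, not when it is visited, keeps the same page from entering the queue twice.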
Export tests to TestRail CSV, Cucumber/Gherkin, Selenium Python scripts, or JSON. Ready for your existing CI/CD and test management workflows.
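As an illustration of what a Cucumber/Gherkin export involves, the sketch below turns a test case into a Gherkin scenario. The `TestCase` shape and the keyword mapping (first step as Given, second as When, remaining steps as And, validations as Then) are assumptions made for this example. The plugin's actual exporter and data model are not shown in the source.

```typescript
// Hypothetical test case shape, loosely based on "steps" and "validation conditions".
interface TestCase {
  title: string;
  steps: string[];
  expected: string[];
}

// Picks the Gherkin keyword for the i-th step of a scenario.
function stepKeyword(index: number): string {
  if (index === 0) return "Given";
  if (index === 1) return "When";
  return "And";
}

// Renders one Feature with one Scenario per test case.
function toGherkin(feature: string, cases: TestCase[]): string {
  const lines: string[] = [`Feature: ${feature}`];
  for (const testCase of cases) {
    lines.push("", `  Scenario: ${testCase.title}`);
    testCase.steps.forEach((step, i) => {
      lines.push(`    ${stepKeyword(i)} ${step}`);
    });
    testCase.expected.forEach((condition, i) => {
      lines.push(`    ${i === 0 ? "Then" : "And"} ${condition}`);
    });
  }
  return lines.join("\n");
}
```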
Launch a visible browser, interact naturally, and capture every click, keystroke, and scroll as replayable test actions. Record once, replay everywhere.
Add, edit, and delete test cases by chatting. No need to leave the conversation: just say "add a test for login".
7 specialized AI testers analyze your page independently — each giving their own verdict with confidence scores.
Every test run is recorded per URL with pass/fail counts, timestamps, and report paths. See quality trends over time.
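A per-URL history like this is enough to compute a simple quality trend. The sketch below assumes a hypothetical `RunRecord` shape with the fields named above. The plugin's real storage format is not shown in the source.

```typescript
// Hypothetical record shape based on the fields listed above.
interface RunRecord {
  url: string;
  timestamp: string; // ISO 8601 in UTC, so string order matches time order
  passed: number;
  failed: number;
  reportPath: string;
}

// Returns the pass rate (0..1) of each run for one URL, oldest first.
function passRateTrend(history: RunRecord[], url: string): number[] {
  return history
    .filter((run) => run.url === url) // filter copies, so the input is not reordered
    .sort((a, b) => (a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0))
    .map((run) => {
      const total = run.passed + run.failed;
      return total === 0 ? 0 : run.passed / total;
    });
}
```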
7 specialized AI testers analyze every page independently — like having a full QA team in your chat.
General: Functional bugs, broken flows, logic errors
UI/UX: Layout, alignment, responsiveness, usability
Security: XSS, injection, exposed data, misconfigs
Privacy: Cookie consent, tracking, data exposure
Accessibility: WCAG violations, ARIA, contrast, navigation
Content: Typos, broken links, missing content, SEO
Mobile: Touch targets, viewport, responsive issues
━━━ Testers.AI Bug Detection ━━━━━━━━━━━━━━━
🌐 https://example.com
🤖 openai | 7 specialized testers
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
── Tester Panel ──
🔍 General: ✓ PASS
🎨 UX: ✗ 2 ISSUES FOUND
  • Inconsistent button sizes [P2 / 85%]
  • Missing hover state [P3 / 78%]
🔒 Security: ✓ PASS
🛡 Privacy: ✗ 1 ISSUE FOUND
  • No cookie consent banner [P1 / 92%]
♿ Accessibility: ✗ 3 ISSUES FOUND
  • Missing alt text [P1 / 95%]
  • Low contrast ratio [P2 / 88%]
  • Inputs missing labels [P1 / 91%]
📝 Content: ✓ PASS
📱 Mobile: ✓ PASS
━━━ Total bugs: 6 | Categories: 7 ━━━━━━━━
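The panel's per-tester verdicts and summary totals can be derived from a list of findings per tester. The sketch below shows one way to do that. The `Finding` and `TesterResult` shapes are hypothetical, chosen only to match the priority and confidence values visible in the sample output, and do not reflect the plugin's real result format.

```typescript
// Hypothetical result shapes matching the "[P2 / 85%]" annotations in the panel.
interface Finding {
  title: string;
  priority: 1 | 2 | 3;
  confidencePct: number; // whole-number percentage, e.g. 85
}

interface TesterResult {
  tester: string;
  findings: Finding[];
}

interface PanelSummary {
  totalBugs: number;
  categories: number;
  lines: string[];
}

// Builds the verdict lines and totals from per-tester findings.
function summarize(results: TesterResult[]): PanelSummary {
  const lines: string[] = [];
  let totalBugs = 0;
  for (const result of results) {
    const count = result.findings.length;
    totalBugs += count;
    if (count === 0) {
      lines.push(`${result.tester}: ✓ PASS`);
      continue;
    }
    lines.push(`${result.tester}: ✗ ${count} ${count === 1 ? "ISSUE" : "ISSUES"} FOUND`);
    for (const finding of result.findings) {
      lines.push(`  • ${finding.title} [P${finding.priority} / ${finding.confidencePct}%]`);
    }
  }
  return { totalBugs, categories: results.length, lines };
}
```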
Get up and running in under 2 minutes.
Clone the repo and install packages.
cd mcp-server
npm install
npm run build
Add to your Claude config:
{
  "mcpServers": {
    "testing": {
      "command": "node",
      "args": ["/path/to/out/index.js"]
    }
  }
}
Tell Claude to configure the plugin:
Configure testing plugin to use OpenAI with API key sk-...
Just ask Claude to test a page:
Quick test google.com
Detect bugs on https://example.com
Quick shortcuts available in Claude Code and as MCP prompts in Claude Desktop.
Choose the AI provider that works best for your needs. Switch at any time.
OpenAI: gpt-5-mini-2025-08-07
Fast, cost-effective. Great for high-volume test generation and quick bug scans.
Anthropic: claude-haiku-4-5
Precise reasoning. Excellent for detailed verification and nuanced bug detection.
Google: gemini-3.1-flash-lite-preview
Strong multimodal vision. Good for visual regression and UI analysis.
How the pieces fit together.
Common patterns for using the testing plugin.
# After deploy, in Claude:
"Quick test
https://staging.myapp.com"
# Or for specific flows:
"Quick test staging.myapp.com
-- complete checkout
with test card"
# In Claude Code after coding:
/test-changes
http://localhost:3000
# Auto-reads git diff,
# generates targeted tests,
# runs them
# Crawl + generate for site:
"Crawl and generate tests
for https://myapp.com
starting from homepage,
max 20 pages"
# WCAG AAA audit:
"Run accessibility audit on
https://myapp.com
at AAA level"
# Returns WCAG violations
# Generate then export:
"Generate tests for myapp.com
then export them as
TestRail CSV"
# Also: cucumber, selenium
# Focused security scan:
"Detect bugs on
https://myapp.com
focusing on security and
privacy categories only"