AI-solved CAPTCHAs, built into the macro DSL
Four CAPTCHA kinds out of the box. Your own LLM key drives the vision model. Embeds as a single step in any macro.
Built in, on your key, in your audit log
Per-solve pricing scales poorly; an external service can't be audited. ai_solve_captcha runs inside your tenant on inference you already pay for.
Pay per solve, slow, brittle, audit-opaque
- Send the screenshot to an external server, wait 10–30 seconds for a result
- Per-solve pricing — runaway costs once you scale to fleets
- Adds an external dependency you can't see inside or audit
- Specific to one or two CAPTCHA shapes; new variants break silently
Vision model on your own key, in your own audit log
- One macro step, runs entirely on your Anthropic / OpenAI / OpenRouter key
- Covers 4 CAPTCHA types: image_selection, slider, click_in_order, checkbox_terms
- Each solve lands in agent_sub_runs with model name and token usage
- Retry budget + exclusion set — failed picks aren't tried again
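To make the last bullet concrete, here is a minimal Python sketch of a retry budget combined with an exclusion set. It is an illustration of the idea only, not the product's implementation: `solve_with_exclusions`, `propose`, `verify` and `SolveResult` are hypothetical names, and the default budget of 3 is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Hashable, Optional


@dataclass
class SolveResult:
    solved: bool
    attempts: int
    excluded: set = field(default_factory=set)  # picks that were tried and rejected


def solve_with_exclusions(
    propose: Callable[[frozenset], Optional[Hashable]],
    verify: Callable[[Hashable], bool],
    retry_budget: int = 3,  # assumed default, not a documented value
) -> SolveResult:
    """Try up to retry_budget picks, never attempting a rejected pick twice."""
    excluded: set = set()
    attempts = 0
    while attempts < retry_budget:
        # The proposer is shown every pick that already failed (hypothetical contract).
        pick = propose(frozenset(excluded))
        if pick is None or pick in excluded:
            break  # nothing new to try; stop instead of burning budget
        attempts += 1
        if verify(pick):
            return SolveResult(True, attempts, excluded)
        excluded.add(pick)
    return SolveResult(False, attempts, excluded)


# Toy usage: a proposer that walks a fixed candidate list.
candidates = ["tile_2", "tile_5", "tile_7"]


def propose(excluded):
    return next((c for c in candidates if c not in excluded), None)


result = solve_with_exclusions(propose, lambda pick: pick == "tile_7")
print(result.solved, result.attempts, sorted(result.excluded))
# prints: True 3 ['tile_2', 'tile_5']
```

The budget caps how much inference one CAPTCHA can consume, and the exclusion set ensures each retry spends its tokens on a pick that has not already failed.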
Four CAPTCHA kinds, one solver step
The auto-classifier picks the right strategy from the frame; you can also pin a kind explicitly for faster, cheaper runs.
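Why pinning is cheaper can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the solver's actual code: `resolve_kind` and `STRATEGY_TOOLS` are hypothetical names, and the identifiers in the table are the ones shown alongside each kind on this page, assumed here to be the tools each strategy relies on.

```python
from typing import Callable, Optional

# Kind names come from this page; pairing them with tools this way is an assumption.
STRATEGY_TOOLS = {
    "image_selection": ("describe_screen", "tap_pixel"),
    "slider": ("swipe", "find_element_on_screen"),
    "click_in_order": ("find_text_on_screen", "tap_by_text"),
    "checkbox_terms": ("tap_by_text",),
}


def resolve_kind(pinned: Optional[str], classify: Callable[[], str]) -> str:
    """Return the pinned kind if one was given; otherwise ask the classifier."""
    # A pinned kind skips the classifier call, which is where the savings come from.
    kind = pinned if pinned is not None else classify()
    if kind not in STRATEGY_TOOLS:
        raise ValueError(f"unsupported CAPTCHA kind: {kind!r}")
    return kind


# Toy usage: count how often the (stand-in) classifier is invoked.
calls = []


def fake_classifier() -> str:
    calls.append("classified")
    return "image_selection"


pinned_kind = resolve_kind("slider", fake_classifier)  # classifier not called
auto_kind = resolve_kind(None, fake_classifier)        # classifier called once
```

In this model the pinned path never touches the classifier, so it avoids one model round trip per solve.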
Image selection
ai_solve_captcha kind=image_selection
describe_screen tap_pixel
Slider
ai_solve_captcha kind=slider
swipe find_element_on_screen
Click in order
ai_solve_captcha kind=click_in_order
find_text_on_screen tap_by_text
Terms checkbox
ai_solve_captcha kind=checkbox_terms
tap_by_text
What ai_solve_captcha looks like at runtime
The agent narrates, calls the solver tool with a 100-second budget, and reports the result with classifier output for audit.
text: "Security Verification"
auto, timeout=100s
text: "Log in" exact
Embed in any macro as a single step
Wrap it in try/catch for graceful fallback. Failed solves leave a clean audit trail and halt the macro deterministically — no orphan sessions.
# Login flow with a CAPTCHA escape hatch
tap "Log in" exact
if visible "Security Verification" timeout=5s {
  try {
    ai_solve_captcha -> captcha timeout=100s
    tap "Log in" exact
  } catch {
    # Bail out of the macro cleanly so the runner can retry
    log "captcha unsolved — terminating"
    terminate_app "com.example.app"
    halt
  }
}
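The macro above halts so that "the runner can retry". What that retry might look like on the runner side is sketched below in Python. Everything here is an assumption for illustration: `run_with_retries`, the `"halted"` / `"ok"` status strings and the default of three runs are hypothetical, and the real runner may behave differently.

```python
from typing import Callable, Tuple


def run_with_retries(run_macro: Callable[[], str], max_runs: int = 3) -> Tuple[str, int]:
    """Re-run a macro that halted, up to max_runs times.

    Returns the final status and the number of runs used.
    """
    if max_runs < 1:
        raise ValueError("max_runs must be at least 1")
    status = "halted"
    for run in range(1, max_runs + 1):
        status = run_macro()
        if status != "halted":
            return status, run  # the macro finished; stop retrying
    return status, max_runs  # still halted after the last allowed run


# Toy usage: a stand-in macro that halts twice, then succeeds.
outcomes = iter(["halted", "halted", "ok"])
final_status, runs_used = run_with_retries(lambda: next(outcomes))
print(final_status, runs_used)
# prints: ok 3
```

Because the catch block terminates the app before halting, each retry in this model starts from a fresh session rather than a half-finished login.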
Stop paying per-solve for CAPTCHAs
Bring your own LLM key and solve as many as your inference budget allows. Drop it into existing macros without a rewrite.
Start free trial