The shortest path from a sentence to a real iPhone tap

Skip XCUITest, simulators, and JSON-RPC plumbing. Plug in a Claude or GPT key, hand the agent a goal, and watch it work the real device.

Inside the loop

One sentence in, a dozen tool calls out

The agent narrates its plan, dispatches OCR, taps, and screenshots until the goal is met — every call audited end to end.

chat / claude-sonnet-4 running

Find the latest unread DM in WeChat from the marketing channel and summarize it.

Launching WeChat, then I'll scan the chat list for the marketing channel and look for the unread badge.

launch_app  bundle_id: "com.tencent.xin"  → ok

find_text_on_screen  text: "marketing"  → match at (210, 384)

tap_by_text  text: "marketing"

describe_screen  → { "unread": "Q3 plan review at 3pm…" }

The latest message is a meeting reminder for the Q3 plan review at 3pm today — no action required from you.
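The dispatch side of that loop can be sketched in a few lines. This is a minimal illustration, not the product's implementation: the tool names come from the trace above, but the tool bodies are stubs with made-up return values, the plan is a fixed list standing in for the model's step-by-step choices, and `run_plan` and the audit record shape are hypothetical.

```python
from typing import Any, Callable

ToolResult = dict[str, Any]

# Stub tools: names follow the trace above, bodies are placeholders
# that return canned values instead of touching a device.
def launch_app(bundle_id: str) -> ToolResult:
    return {"status": "ok"}

def find_text_on_screen(text: str) -> ToolResult:
    return {"match": [210, 384]}

def tap_by_text(text: str) -> ToolResult:
    return {"status": "ok"}

def describe_screen() -> ToolResult:
    return {"unread": "Q3 plan review at 3pm…"}

TOOLS: dict[str, Callable[..., ToolResult]] = {
    "launch_app": launch_app,
    "find_text_on_screen": find_text_on_screen,
    "tap_by_text": tap_by_text,
    "describe_screen": describe_screen,
}

# A fixed plan standing in for the model; a real agent would pick each
# next call after reading the previous result.
PLAN: list[tuple[str, dict[str, Any]]] = [
    ("launch_app", {"bundle_id": "com.tencent.xin"}),
    ("find_text_on_screen", {"text": "marketing"}),
    ("tap_by_text", {"text": "marketing"}),
    ("describe_screen", {}),
]

def run_plan(
    plan: list[tuple[str, dict[str, Any]]],
    tools: dict[str, Callable[..., ToolResult]],
) -> list[dict[str, Any]]:
    """Dispatch each call in order and append one audit record per call."""
    audit: list[dict[str, Any]] = []
    for name, args in plan:
        if name in tools:
            result = tools[name](**args)
        else:
            # Unknown tools are recorded rather than raised, so the log stays complete.
            result = {"error": f"unknown tool: {name}"}
        audit.append({"tool": name, "args": args, "result": result})
    return audit
```

Calling `run_plan(PLAN, TOOLS)` returns one record per tool call, which is the shape an end-to-end audit log would need: what was called, with which arguments, and what came back.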

Architecture

From your editor's chat box to a real iPhone

The MCP client speaks stdio, the bridge translates each tool call into authenticated HTTPS, and the dashboard fans out to devices over a reverse tunnel. End to end, across all four stages, a call completes in under 200ms.

architecture

   MCP client      Claude  /  GPT  /  any LLM
        │
        │   tool-call
        ▼
   agentfy-mcp-server     ← 40+ device tools
        │
        │   HTTPS  +  X-API-Key
        ▼
   app.agentfy.io         ← tenant scope, audit log
        │
        │   reverse tunnel
        ▼
   real iPhones           ← 1 device, or 100
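To make the middle hop concrete, here is a rough sketch of how a bridge might wrap one tool call in an HTTPS request. Only the host `app.agentfy.io` and the `X-API-Key` header come from the diagram; the `/api/tools/call` path, the JSON body fields, and the `build_tool_request` helper are assumptions for illustration, not the documented API. The request is built but not sent.

```python
import json
import urllib.request
from typing import Any

BASE_URL = "https://app.agentfy.io"  # host taken from the diagram
TOOL_PATH = "/api/tools/call"        # hypothetical path, not a documented endpoint

def build_tool_request(
    api_key: str,
    device_id: str,
    tool: str,
    args: dict[str, Any],
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one tool call."""
    # Assumed body shape: which device, which tool, which arguments.
    body = json.dumps(
        {"device_id": device_id, "tool": tool, "args": args}
    ).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + TOOL_PATH,
        data=body,
        method="POST",
        headers={
            "X-API-Key": api_key,  # header name from the diagram
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the call. Keeping construction separate from sending makes the request easy to inspect or log before it leaves the machine.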

Tools: 40+
Latency: < 200ms
Per-tenant: isolated
Setup: 60s

Tool surface

40+ tools, exposed as first-class MCP calls

The same tools your scripts can call — the agent just happens to be the most general consumer.

Device input

What the agent can do on the screen

tap tap_by_text swipe long_press text press_home press_lock

Perception

What the agent can see

screenshot describe_screen find_text_on_screen find_element_on_screen ocr

App control

Lifecycle and deep links

launch_app terminate_app get_foreground_app open_url list_apps

Sub-agents + AI

Hand off the messy bits

ai_takeover ai_solve_captcha ai_extract ai_classify

Network + state

Talk to the outside world

http extract jsonpath set log

Vault + clipboard

Secrets and host-device IO

${vault.X} read_clipboard write_clipboard paste_to_phone
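The `${vault.X}` entry reads as a placeholder that is swapped for a stored secret before a tool runs. How the product resolves it is not described here, so the following is only a sketch of the general idea: the `resolve_vault_refs` helper, the in-memory dictionary used as a vault, and the rule that names are identifier-like are all assumptions.

```python
import re

# Matches ${vault.NAME}, where NAME looks like an identifier (an assumption).
_VAULT_REF = re.compile(r"\$\{vault\.([A-Za-z_][A-Za-z0-9_]*)\}")

def resolve_vault_refs(template: str, vault: dict[str, str]) -> str:
    """Replace every ${vault.NAME} in template with vault[NAME]."""
    def _lookup(match: re.Match[str]) -> str:
        name = match.group(1)
        if name not in vault:
            # Fail loudly instead of sending a literal placeholder to the device.
            raise KeyError(f"no vault entry named {name!r}")
        return vault[name]
    # A function replacement inserts the secret verbatim, so backslashes
    # in a secret are not treated as regex escapes.
    return _VAULT_REF.sub(_lookup, template)
```

For example, `resolve_vault_refs("${vault.LOGIN}", {"LOGIN": "alice"})` returns `"alice"`. The point of resolving late is that the secret appears only in the final tool argument, never in the prompt or the script.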

Bring your LLM key, bring an iPhone, start agenting

BYOK on every plan. 40+ MCP tools. 60 seconds to first call.
