rn-mcp CLI (Snapshot + Refs)

A shell-first interface for AI agents to control React Native apps. Uses the Snapshot + Refs pattern inspired by agent-browser for token-efficient interaction.

Why CLI?

                MCP Tools                                          rn-mcp CLI
Token cost      High: full JSON responses per tool call            Low: compact refs (@e1, @e2)
Steps to tap    query_selector → extract coords → tap (3 calls)    rn-mcp tap @e3 (1 call)
Setup           MCP client config required                         Shell command, no config
Best for        Cursor, Windsurf (MCP-only editors)                Claude Code, Codex, shell scripts

Installation

The CLI is included in the same package:

# Global install
npm install -g @ohah/react-native-mcp-server

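# Or run with npx without installing (--package names the package that provides the rn-mcp binary)
npx --package @ohah/react-native-mcp-server rn-mcp --help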

# Or project-local
npm install -D @ohah/react-native-mcp-server
npx rn-mcp --help

Prerequisites

  • The MCP server must be running (started by your editor or with npx react-native-mcp-server)
  • The app must be running on a simulator/emulator and connected via WebSocket (port 12300)
  • iOS: idb installed
  • Android: adb installed

Workflow

# 1. Check connection
rn-mcp status

# 2. Get interactive elements as @refs
rn-mcp snapshot -i

# 3. Interact using @refs
rn-mcp tap @e3
rn-mcp type @e5 "user@example.com"

# 4. After screen transition, refresh refs
rn-mcp snapshot -i

Snapshot output example

@e1   View #login-screen
@e2     TextInput #email "Email"
@e3     TextInput #password "Password"
@e4     Pressable #login-btn "Sign In"
@e5     Pressable #signup-link "Create Account"

Each element gets a short ref (@e1, @e2, ...) assigned in depth-first order.
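
Because the text output is line-oriented, a script can look up the ref for a known testID instead of hard-coding it. The following is a minimal sketch, assuming every snapshot line has the shape shown above (ref in the first field, component type in the second, #testID in the third). ref_for is a hypothetical helper, not part of rn-mcp.

```shell
# Hypothetical helper: print the @ref of the first line carrying the given testID.
# Assumes each snapshot line looks like: @e4  Pressable #login-btn "Sign In"
ref_for() {
  # $1 = testID without the leading '#'; snapshot text is read from stdin
  awk -v id="#$1" '$3 == id { print $1; exit }'
}

# Usage (requires a connected app):
#   rn-mcp snapshot -i | ref_for login-btn
```

If the real output places the testID in a different column, only the field numbers in the awk program need to change.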

Commands

Connection

Command          Description
rn-mcp status    Show connection status and devices

Snapshot

Command                           Description
rn-mcp snapshot                   Full component tree with @refs
rn-mcp snapshot -i                Interactive elements only (recommended)
rn-mcp snapshot --max-depth 10    Limit tree depth (default: 30)
rn-mcp snapshot -i --json         JSON output for scripting

Interaction

Command                             Description
rn-mcp tap @e3                      Tap element by ref
rn-mcp tap "#login-btn"             Tap by selector
rn-mcp tap @e3 --long 500           Long press (500 ms)
rn-mcp type @e5 "text"              Type into TextInput
rn-mcp swipe @e2 down               Swipe element
rn-mcp swipe @e2 down --dist 300    Swipe with distance (dp)
rn-mcp key back                     Press hardware key

Available keys: back, home, enter, tab, delete, up, down, left, right

Assertions

Command                              Description
rn-mcp assert text "Welcome"         Verify text exists (exit 0/1)
rn-mcp assert visible @e3            Verify element is visible
rn-mcp assert not-visible @e3        Verify element is NOT visible
rn-mcp assert count "Pressable" 5    Verify element count
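
Since assert text reports its result through the exit code, it can drive a shell loop directly, for example to wait for a screen to finish loading. This is a sketch under assumptions: wait_for_text is a hypothetical helper, and RN_MCP and POLL_INTERVAL are environment variables read only by this helper (rn-mcp itself does not know about them).

```shell
# Hypothetical helper: poll "assert text" until it passes or attempts run out.
# RN_MCP overrides the binary to call (default: rn-mcp).
# POLL_INTERVAL is the pause between attempts in seconds (default: 1).
wait_for_text() {
  wft_text=$1
  wft_attempts=${2:-5}
  wft_i=0
  while [ "$wft_i" -lt "$wft_attempts" ]; do
    # Exit code 0 means the text is on screen, so the assertion is the loop condition
    if "${RN_MCP:-rn-mcp}" assert text "$wft_text" >/dev/null 2>&1; then
      return 0
    fi
    wft_i=$((wft_i + 1))
    sleep "${POLL_INTERVAL:-1}"
  done
  return 1
}

# Usage: wait_for_text "Welcome" 10 || echo "Welcome never appeared" >&2
```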

Query

Command                           Description
rn-mcp query "#my-btn"            Query single element info
rn-mcp query --all "Pressable"    Query all matching elements

Screenshot

Command                           Description
rn-mcp screenshot                 Save screenshot (default: screenshot.png)
rn-mcp screenshot -o login.png    Save to specific file
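
One way to combine screenshots with assertions is to capture the screen only when a check fails, so a failing script leaves evidence behind. The sketch below assumes every assert subcommand exits non-zero on failure (the table above states this only for assert text). assert_or_capture and the RN_MCP override variable are hypothetical and not part of rn-mcp.

```shell
# Hypothetical helper: run an assertion; on failure, save a screenshot for debugging.
# $1 = screenshot file to write on failure; remaining args go to "rn-mcp assert".
assert_or_capture() {
  aoc_shot=$1
  shift
  if "${RN_MCP:-rn-mcp}" assert "$@"; then
    return 0
  fi
  "${RN_MCP:-rn-mcp}" screenshot -o "$aoc_shot"
  return 1
}

# Usage: assert_or_capture login-failed.png text "Dashboard"
```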

Agent Guide Setup

Command                              Description
rn-mcp init-agent                    Add CLI guide to AGENTS.md + CLAUDE.md
rn-mcp init-agent --lang ko          Korean guide
rn-mcp init-agent --target claude    CLAUDE.md only

Global Options

-d, --device <id>        Target device (when multiple connected)
-p, --platform <os>      ios or android
--port <n>               WebSocket port (default: 12300)
--json                   JSON output for scripting
--timeout <ms>           Command timeout (default: 10000)
-h, --help               Show help
-v, --version            Show version

Refs System

How refs work

  1. rn-mcp snapshot -i assigns @e1, @e2, ... to each element (depth-first order)
  2. Refs are saved to ~/.rn-mcp/session.json
  3. Subsequent commands use refs: rn-mcp tap @e3
  4. Running snapshot again invalidates all previous refs

When to re-snapshot

  • After screen transitions (navigation, modal open/close)
  • When a command fails with an "@ref not found" error
  • After actions that change the UI structure
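
In a script, the re-snapshot rule can be wrapped around each interaction. The following sketch assumes that a failed interaction (such as a stale ref) exits non-zero, which this guide states only for assertions. act and the RN_MCP override variable are hypothetical names.

```shell
# Hypothetical wrapper: run an interaction; if it fails, refresh the refs and report.
act() {
  if "${RN_MCP:-rn-mcp}" "$@"; then
    return 0
  fi
  echo "rn-mcp $* failed; refs may be stale, taking a new snapshot" >&2
  # Prints the fresh element list so the caller can pick the new ref
  "${RN_MCP:-rn-mcp}" snapshot -i
  return 1
}

# Usage: act tap @e3 || echo "choose a ref from the snapshot above and retry" >&2
```

The wrapper deliberately does not retry with the old ref: a new snapshot invalidates all previous refs, so the same number may now point at a different element.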

Selectors (alternative to refs)

You can also pass a selector directly, without taking a snapshot first:

rn-mcp tap "#login-btn"                    # by testID
rn-mcp tap "Pressable:text(\"Sign In\")"   # by type + text
rn-mcp tap "TextInput[placeholder=\"Email\"]"  # by attribute
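
The backslashes in the selectors above are shell escaping, not part of the selector syntax. Wrapping the selector in single quotes passes the same string with less noise, as this short check illustrates (plain shell, no rn-mcp call involved):

```shell
# Both assignments produce the same selector string;
# single quotes let the inner double quotes stand unescaped.
escaped="Pressable:text(\"Sign In\")"
single='Pressable:text("Sign In")'

[ "$escaped" = "$single" ] && echo "identical selectors"
# prints: identical selectors

# Equivalent invocation of the second example above:
#   rn-mcp tap 'Pressable:text("Sign In")'
```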

Important Notes

  • iOS orientation is handled automatically — no manual action needed
  • Android dp→px conversion is automatic
  • Coordinates are in points (dp), not pixels
  • Use --json flag for programmatic output parsing

Example: Login Flow

# Check connection
rn-mcp status

# See what's on screen
rn-mcp snapshot -i
# @e1   TextInput #email "Email"
# @e2   TextInput #password "Password"
# @e3   Pressable #login-btn "Sign In"

# Fill form and submit
rn-mcp type @e1 "test@example.com"
rn-mcp type @e2 "password123"
rn-mcp tap @e3

# Verify navigation
rn-mcp assert text "Dashboard"

# Get new screen's elements
rn-mcp snapshot -i
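
For repeated runs, the same flow can be wrapped in a function that stops at the first failing step. This is a sketch, assuming that each command exits non-zero on failure (stated in this guide only for assertions) and that the login screen yields the refs @e1 to @e3 shown above. login_flow and the RN_MCP override variable are hypothetical names.

```shell
# Hypothetical wrapper around the login flow; stops at the first failing step.
# $1 = email, $2 = password. The refs @e1..@e3 match the example screen above.
login_flow() {
  lf_rn=${RN_MCP:-rn-mcp}
  # The snapshot call assigns the refs used by the type and tap steps
  "$lf_rn" status >/dev/null &&
    "$lf_rn" snapshot -i >/dev/null &&
    "$lf_rn" type @e1 "$1" &&
    "$lf_rn" type @e2 "$2" &&
    "$lf_rn" tap @e3 &&
    "$lf_rn" assert text "Dashboard"
}

# Usage: login_flow "test@example.com" "password123" || echo "login failed" >&2
```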