FrootAI: AmpliFAI your AI Ecosystem

Play 23

Browser Automation

Complexity: High, Status: Ready

AI-driven web navigation using vision + Playwright MCP.

Uses AI to navigate websites, fill forms, extract data, take screenshots, and execute multi-step web workflows, all driven by natural-language instructions. It combines three parts: the Playwright MCP Server for browser control (navigate, click, type, screenshot), GPT-4o Vision for understanding page content and making navigation decisions, and structured task planning for breaking complex web tasks into executable steps. A domain allowlist prevents arbitrary browsing.
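
As a rough illustration of the domain allowlist described above, the check below accepts a URL only when its host matches an allowed domain or one of its subdomains. This is a minimal sketch, not the play's actual code: the function name `is_allowed` and the example domains are hypothetical, and the real list would presumably be loaded from config/browser.json.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; real values would come from config.
ALLOWED_DOMAINS = {"example.com", "docs.example.org"}


def is_allowed(url: str, allowed: set[str] = ALLOWED_DOMAINS) -> bool:
    """Return True only for http(s) URLs on an allowed domain or its subdomains."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False  # rejects file:, javascript:, data:, and similar schemes
    host = (parts.hostname or "").lower()
    # Match on a full label boundary so "evil-example.com" does not pass.
    return any(host == d or host.endswith("." + d) for d in allowed)
```

Matching on the parsed hostname rather than on a substring of the URL means look-alike hosts such as `example.com.evil.io` are rejected. A check like this would run before every navigation step, not only the first.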

Architecture Pattern

Browser automation: vision model + Playwright, task planning, domain-restricted
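
One way to read this pattern is as an observe, decide, act loop. The sketch below is an assumption about how such a loop could be wired, not the play's implementation: `Action`, `run_task`, and the three callables are hypothetical stand-ins for the vision model call and the Playwright MCP tools.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str          # "navigate", "click", "type", or "done"
    target: str = ""   # URL or element description
    text: str = ""     # text to type, if any


def run_task(
    goal: str,
    decide: Callable[[str, bytes], Action],  # stand-in for the vision model call
    screenshot: Callable[[], bytes],         # stand-in for the screenshot tool
    execute: Callable[[Action], None],       # stand-in for the browser tools
    max_steps: int = 10,
) -> list[Action]:
    """Observe the page, ask the model for the next action, execute it, repeat."""
    history: list[Action] = []
    for _ in range(max_steps):
        action = decide(goal, screenshot())
        history.append(action)
        if action.kind == "done":
            break  # the model reports the goal as reached
        execute(action)
    return history
```

The `max_steps` cap bounds the loop even if the model never returns a `done` action, which is one plausible way to enforce a maximum navigation depth.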

Azure Services

  • Azure OpenAI (gpt-4o Vision)
  • Container Apps
  • Playwright MCP Server

DevKit (.github Agentic OS)

  • agent.md — root orchestrator with builder→reviewer→tuner handoffs
  • 3 agents — Browser Builder (gpt-4o), Reviewer (gpt-4o-mini), Tuner (gpt-4o-mini)
  • 3 skills — deploy (102 lines), evaluate (100 lines), tune (103 lines)
  • 4 prompts — /deploy, /test, /review, /evaluate with agent routing
  • .vscode/mcp.json — FrootAI MCP with OpenAI + target URL inputs + envFile

TuneKit (AI Config)

  • config/openai.json — gpt-4o vision model, temp=0.1
  • config/browser.json — domain allowlist, timeouts, viewport config
  • config/guardrails.json — no credential entry, screenshot PII redaction
  • evaluation/eval.py — Task completion >85%, Error rate <10%
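
The two evaluation thresholds listed above could be applied as a simple pass/fail gate. The following is a sketch under stated assumptions: `TaskResult` and `evaluate` are hypothetical names, and error rate is assumed here to mean failed steps divided by total steps, which the source does not define.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    completed: bool  # did the task reach its goal?
    steps: int       # browser actions attempted
    errors: int      # actions that failed


def evaluate(
    results: list[TaskResult],
    min_completion: float = 0.85,
    max_error_rate: float = 0.10,
) -> dict:
    """Compute task completion and step error rate, then apply both thresholds."""
    if not results:
        raise ValueError("no results to evaluate")
    total_steps = sum(r.steps for r in results)
    completion = sum(r.completed for r in results) / len(results)
    error_rate = sum(r.errors for r in results) / total_steps if total_steps else 0.0
    return {
        "task_completion": completion,
        "error_rate": error_rate,
        # Strict inequalities mirror the ">85%" and "<10%" targets.
        "passed": completion > min_completion and error_rate < max_error_rate,
    }
```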

Tuning Parameters

  • Domain allowlist
  • Vision prompts for page understanding
  • Action timeout per step
  • Retry config on navigation failure
  • Max navigation depth
  • Screenshot resolution
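
To make the retry parameter concrete, here is a minimal sketch of retrying a failed navigation step with exponential backoff. The names `NavigationError` and `with_retries` and the default values are hypothetical; the play's actual retry settings and failure types are not given in the source.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class NavigationError(Exception):
    """Hypothetical error raised when a navigation step fails or times out."""


def with_retries(
    step: Callable[[], T],
    retries: int = 2,
    backoff_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run step, retrying on NavigationError with exponentially growing waits."""
    if retries < 0:
        raise ValueError("retries must be >= 0")
    last_error: Exception | None = None
    for attempt in range(retries + 1):
        try:
            return step()
        except NavigationError as exc:
            last_error = exc
            if attempt < retries:
                sleep(backoff_s * 2 ** attempt)  # 0.5s, 1.0s, 2.0s, ...
    assert last_error is not None
    raise last_error
```

Only `NavigationError` is retried, so unrelated failures such as a guardrail violation surface immediately instead of being repeated.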

Estimated Cost

  • Dev/Test: $100–200/mo
  • Production: $1K–3K/mo