Play 42
Computer Use Agent
Very High
✅ Ready
Vision-based desktop and web automation replacing brittle RPA with screen understanding.
Vision-based desktop and web automation: an AI agent controls applications through screenshots and mouse/keyboard input, replacing brittle RPA with screen understanding for legacy systems and cross-app workflows. The agent runs securely in Azure Container Apps with action replay and rollback capabilities. It is best suited to automating legacy enterprise software that exposes no API.
Architecture Pattern
Vision-reasoning-action loop: screenshot capture, element detection, deterministic action execution
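The loop named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the play's actual implementation: `Action`, `run_loop`, and the injected `capture`, `decide`, and `execute` callables are hypothetical names, and a real `decide` would send the screenshot to a vision model rather than return canned actions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    """One deterministic UI action (hypothetical shape, for illustration only)."""
    kind: str          # e.g. "click", "type", or "done"
    target: str = ""   # label of the detected screen element
    text: str = ""     # text to type, if any


def run_loop(
    capture: Callable[[], bytes],
    decide: Callable[[bytes, str], Action],
    execute: Callable[[Action], None],
    goal: str,
    max_steps: int = 20,
) -> list[Action]:
    """Repeat capture -> decide -> execute until 'done' or the step budget runs out."""
    history: list[Action] = []
    for _ in range(max_steps):
        screenshot = capture()              # vision: grab the current screen
        action = decide(screenshot, goal)   # reasoning: choose the next action
        if action.kind == "done":
            break
        execute(action)                     # action: perform it deterministically
        history.append(action)
    return history
```

Passing the three stages in as callables keeps the loop testable without a real screen or model, and the returned history is the kind of record that action replay would need.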
Azure Services
Azure OpenAI (gpt-4o Vision), Container Apps, Blob Storage, Key Vault, Azure Monitor
DevKit (.github Agentic OS)
- agent.md — root orchestrator with builder→reviewer→tuner handoffs
- 3 agents — Computer Use Builder (gpt-4o), Reviewer (gpt-4o-mini), Tuner (gpt-4o-mini)
- 3 skills — deploy (210 lines), evaluate (168 lines), tune (266 lines)
- 4 prompts — /deploy, /test, /review, /evaluate with agent routing
- .vscode/mcp.json — FrootAI MCP with OpenAI Vision + VM password inputs + envFile
TuneKit (AI Config)
- config/openai.json — gpt-4o vision, temp=0.1
- config/browser.json — resolution, timeouts, action limits
- config/guardrails.json — no credential entry, screenshot redaction
- evaluation/eval.py — Task completion >85%, Error rate <10%
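As an illustration of the evaluation gate listed above, the sketch below checks a run against the two stated thresholds (task completion above 85%, error rate below 10%). The function name, its parameters, and the definition of error rate as failed actions divided by total actions are assumptions; the contents of evaluation/eval.py are not shown in this document.

```python
def passes_thresholds(
    tasks_completed: int,
    tasks_total: int,
    failed_actions: int,
    actions_total: int,
    min_completion: float = 0.85,   # from the stated ">85%" target
    max_error_rate: float = 0.10,   # from the stated "<10%" target
) -> bool:
    """Return True only if both targets are met (strict inequalities)."""
    if tasks_total == 0 or actions_total == 0:
        return False  # nothing measured, so nothing can pass
    completion = tasks_completed / tasks_total
    # Assumption: error rate is measured per action, not per task.
    error_rate = failed_actions / actions_total
    return completion > min_completion and error_rate < max_error_rate
```

Both comparisons are strict, so a run that lands exactly on 85% completion or exactly on 10% errors does not pass.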
Tuning Parameters
Vision confidence threshold, Action retry limit, Screenshot resolution, Timeout per step, Rollback policy
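One way to picture how these parameters fit together is a small settings object plus a retry helper. Everything here is a hedged sketch: the `Tuning` class, its field names, the default values, and `locate_with_retries` are hypothetical, and the interpretation of the retry limit as a total number of attempts is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tuning:
    """Illustrative container for the five tuning parameters (placeholder values)."""
    vision_confidence_threshold: float = 0.8
    action_retry_limit: int = 3                      # treated as total attempts
    screenshot_resolution: tuple[int, int] = (1280, 800)
    timeout_per_step_s: float = 30.0
    rollback_policy: str = "on_failure"              # hypothetical policy name


def locate_with_retries(detect: Callable[[], float], tuning: Tuning) -> bool:
    """Call detect() up to the retry limit; succeed once confidence meets the threshold."""
    for _ in range(tuning.action_retry_limit):
        if detect() >= tuning.vision_confidence_threshold:
            return True
    return False  # caller would then apply the rollback policy
```

Raising the threshold trades more retries (and more vision calls) for fewer actions taken on a misread screen, which is why the two parameters are usually tuned together.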
Estimated Cost
Dev/Test: $100–200/mo
Production: $3K–8K/mo