Play 44
Foundry Local On-Device
High · ✅ Ready
On-device LLM inference for air-gapped environments with cloud escalation.
On-device LLM inference for air-gapped and data-sovereign environments: a local model handles routine queries while complex reasoning escalates to the cloud, with automatic fallback, sync, and fleet management via IoT Hub. Supports fully disconnected operation with queued sync. Ideal for manufacturing floors, field operations, and government classified environments.
Architecture Pattern
Hybrid local-cloud inference: confidence-based escalation, offline queue, fleet sync
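The escalation flow above can be sketched as follows. This is a minimal illustration, not the play's actual implementation: the threshold value, the `local_infer`/`cloud_infer` stubs, and the queue handling are all hypothetical placeholders for the real Foundry Local and Azure OpenAI calls.

```python
from collections import deque

ESCALATION_THRESHOLD = 0.75  # hypothetical tuning value (see Tuning Parameters)

offline_queue = deque()  # queries held for later sync while the cloud is unreachable

def local_infer(query):
    # Stub for the on-device model; returns (answer, confidence).
    # Here, short queries are treated as "routine" (high confidence).
    return f"local:{query}", 0.9 if len(query) < 40 else 0.5

def cloud_infer(query):
    # Stub for the Azure OpenAI call; a real client would raise on network failure.
    return f"cloud:{query}"

def answer(query, online=True):
    """Confidence-based escalation with an offline queue."""
    result, confidence = local_infer(query)
    if confidence >= ESCALATION_THRESHOLD:
        return result                  # routine query: stay on-device
    if online:
        try:
            return cloud_infer(query)  # complex query: escalate to cloud
        except ConnectionError:
            pass                       # fall through to offline handling
    offline_queue.append(query)        # disconnected: queue for later sync
    return result                      # serve the best local answer meanwhile
```

When disconnected, the device keeps answering from the local model and drains the queue once connectivity returns, which is what makes the pattern viable for air-gapped floors.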
Azure Services
Azure OpenAIAzure IoT HubAzure MonitorKey Vault
DevKit (.github Agentic OS)
- agent.md — root orchestrator with builder→reviewer→tuner handoffs
- 3 agents — Foundry Local Builder (gpt-4o), Reviewer (gpt-4o-mini), Tuner (gpt-4o-mini)
- 3 skills — deploy (246 lines), evaluate (178 lines), tune (233 lines)
- 4 prompts — /deploy, /test, /review, /evaluate with agent routing
- .vscode/mcp.json — FrootAI MCP with OpenAI key + cache path inputs + envFile
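A `.vscode/mcp.json` along these lines could wire up the inputs and envFile mentioned above. The server command, package name, and input ids are hypothetical; only the overall `servers`/`inputs` shape follows VS Code's MCP configuration format.

```json
{
  "inputs": [
    {
      "id": "openai-api-key",
      "type": "promptString",
      "description": "OpenAI API key",
      "password": true
    },
    {
      "id": "cache-path",
      "type": "promptString",
      "description": "Local model cache path"
    }
  ],
  "servers": {
    "frootai": {
      "command": "npx",
      "args": ["-y", "frootai-mcp"],
      "env": {
        "OPENAI_API_KEY": "${input:openai-api-key}",
        "MODEL_CACHE_PATH": "${input:cache-path}"
      },
      "envFile": "${workspaceFolder}/.env"
    }
  }
}
```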
TuneKit (AI Config)
- config/openai.json — gpt-4o for cloud, local model config
- config/edge.json — escalation threshold, sync interval, memory budget
- config/guardrails.json — model validation, inference safety
- evaluation/eval.py — Local accuracy >80%, Escalation rate <20%
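A harness enforcing those two gates might look like the sketch below. This is not the actual `evaluation/eval.py`; the record shape and function name are assumptions, but the pass criteria mirror the stated thresholds (local accuracy >80%, escalation rate <20%).

```python
def evaluate(records):
    """records: list of (handled_locally: bool, correct: bool), one per query."""
    local = [correct for handled_locally, correct in records if handled_locally]
    local_accuracy = sum(local) / max(len(local), 1)
    escalation_rate = 1 - len(local) / len(records)
    return {
        "local_accuracy": local_accuracy,
        "escalation_rate": escalation_rate,
        "passed": local_accuracy > 0.80 and escalation_rate < 0.20,
    }
```

For example, a run where 9 of 10 queries stay local and 8 of those 9 are correct passes both gates (accuracy ≈ 0.89, escalation 0.10).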
Tuning Parameters
- Local model threshold
- Cloud escalation confidence
- Sync interval
- Device memory budget
- Queue retention policy
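As a rough illustration, these knobs could surface in `config/edge.json` roughly as follows. The field names and values are hypothetical, shown only to make the parameters concrete.

```json
{
  "local_model_threshold": 0.75,
  "cloud_escalation_confidence": 0.60,
  "sync_interval_seconds": 300,
  "device_memory_budget_mb": 4096,
  "queue_retention_hours": 72
}
```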
Estimated Cost
- Dev/Test: $50–120/mo
- Production: $1K–4K/mo