Play 95
Multimodal Search Engine v2
Very High · ✅ Ready
Unified search across images, text, code, and audio with cross-modal reasoning. A user can search by uploading an image and get related code snippets, documentation, and audio explanations. Combines Azure AI Search vector indexes across modalities with GPT-4o for cross-modal synthesis.
Architecture Pattern
Cross-modal search: query decomposition → modality-specific indexing → vector fusion → cross-modal synthesis → relevance feedback
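The vector fusion step can be sketched as weighted late fusion: each modality index scores the same candidate set, and a weighted sum produces the final ranking. The weights and function names below are illustrative assumptions, not the play's actual implementation.

```python
import numpy as np

# Hypothetical per-modality weights (the play tunes these; values here are
# placeholders for illustration only).
MODALITY_WEIGHTS = {"image": 0.35, "text": 0.35, "code": 0.20, "audio": 0.10}

def fuse_scores(per_modality_scores: dict) -> np.ndarray:
    """Late fusion: weighted sum of per-modality similarity scores."""
    fused = np.zeros_like(next(iter(per_modality_scores.values())), dtype=float)
    for modality, scores in per_modality_scores.items():
        fused += MODALITY_WEIGHTS.get(modality, 0.0) * scores
    return fused

# Three candidates scored by two modality indexes (cosine similarities).
scores = {
    "image": np.array([0.9, 0.2, 0.5]),
    "text":  np.array([0.1, 0.8, 0.6]),
}
ranking = np.argsort(-fuse_scores(scores))  # indices, best candidate first
```

A candidate that is merely decent in every modality can outrank one that is excellent in a single modality, which is usually the desired behavior for cross-modal queries.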
Azure Services
Azure AI Search · Azure AI Vision · Azure AI Speech · Azure OpenAI · Azure Container Apps
DevKit (.github Agentic OS)
- agent.md — root orchestrator with builder→reviewer→tuner handoffs
- 3 agents — Multimodal Builder (gpt-4o), Reviewer (gpt-4o-mini), Tuner (gpt-4o-mini)
- 3 skills — deploy (254 lines), evaluate (101 lines), tune (227 lines)
- 4 prompts — /deploy, /test, /review, /evaluate with agent routing
- .vscode/mcp.json — FrootAI MCP with OpenAI + AI Search inputs + envFile
TuneKit (AI Config)
- config/openai.json — cross-modal synthesis and query expansion prompts
- config/search.json — fusion weights, index configs, diversity scores
- config/guardrails.json — relevance minimums, latency budgets
- evaluation/eval.py — cross-modal NDCG >0.75, latency <500ms
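The contents of evaluation/eval.py are not shown here, but the NDCG gate it enforces can be sketched as follows; the function names and sample relevance grades are assumptions for illustration.

```python
import math

def dcg(relevances: list) -> float:
    """Discounted cumulative gain: graded relevance discounted by log rank."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(ranked_relevances: list, k: int = 10) -> float:
    """NDCG@k: DCG of the system ranking over the ideal (sorted) DCG."""
    ideal_dcg = dcg(sorted(ranked_relevances, reverse=True)[:k])
    return dcg(ranked_relevances[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0

# Gate from the card (threshold real, relevance grades illustrative):
assert ndcg([3, 2, 3, 0, 1, 2]) > 0.75
```

NDCG rewards placing highly relevant results near the top, which is why it suits a fused ranking where per-modality scores are not directly comparable.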
Tuning Parameters
Cross-modal fusion weights · Modality-specific index config · Result diversity score · Query expansion depth · Relevance feedback loop
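One common way to implement a result diversity score is maximal marginal relevance (MMR), which penalizes results redundant with those already selected. Whether this play uses MMR is an assumption; the sketch below shows the general technique with hypothetical names.

```python
def mmr_rerank(relevance: dict, similarity: dict, lambda_div: float = 0.7,
               k: int = 3) -> list:
    """Maximal Marginal Relevance re-ranking.
    relevance:  doc_id -> query relevance score
    similarity: (doc_a, doc_b) -> pairwise doc similarity (symmetric)
    lambda_div: 1.0 = pure relevance, 0.0 = pure diversity
    """
    selected, candidates = [], set(relevance)
    while candidates and len(selected) < k:
        def mmr(doc):
            # Redundancy = max similarity to anything already selected.
            redundancy = max(
                (similarity.get((doc, s), similarity.get((s, doc), 0.0))
                 for s in selected), default=0.0)
            return lambda_div * relevance[doc] - (1 - lambda_div) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Lowering lambda_div trades raw relevance for variety, e.g. surfacing a code snippet and a diagram instead of two near-identical documentation pages.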
Estimated Cost
Dev/Test
$120-300/mo
Production
$4K-12K/mo