Quality scores, evaluation thresholds, and WAF alignment status for each solution play.

| Play | Groundedness | Relevance | Coherence | Fluency | Status |
|---|---|---|---|---|---|
| 01 — Enterprise RAG | 4.2 | 4.1 | 4.3 | 4.5 | ✅ Evaluated |
| 02 — AI Landing Zone | N/A | N/A | N/A | N/A | ✅ Evaluated |
| 03 — Deterministic Agent | 4.5 | 4.3 | 4.6 | 4.4 | ✅ Evaluated |
| 04 — Play 4 | — | — | — | — | ⏳ Skeleton |
| 05 — Play 5 | — | — | — | — | ⏳ Skeleton |
| 06 — Play 6 | — | — | — | — | ⏳ Skeleton |
| 07 — Play 7 | — | — | — | — | ⏳ Skeleton |
| 08 — Play 8 | — | — | — | — | ⏳ Skeleton |
| 09 — Play 9 | — | — | — | — | ⏳ Skeleton |
| 10 — Play 10 | — | — | — | — | ⏳ Skeleton |
| 11 — Play 11 | — | — | — | — | ⏳ Skeleton |
| 12 — Play 12 | — | — | — | — | ⏳ Skeleton |
| 13 — Play 13 | — | — | — | — | ⏳ Skeleton |
| 14 — Play 14 | — | — | — | — | ⏳ Skeleton |
| 15 — Play 15 | — | — | — | — | ⏳ Skeleton |
| 16 — Play 16 | — | — | — | — | ⏳ Skeleton |
| 17 — Play 17 | — | — | — | — | ⏳ Skeleton |
| 18 — Play 18 | — | — | — | — | ⏳ Skeleton |
| 19 — Play 19 | — | — | — | — | ⏳ Skeleton |
| 20 — Play 20 | — | — | — | — | ⏳ Skeleton |

- `npx frootai validate --waf` reports the WAF scorecard: 6 pillars, 17 checks.
- `Ctrl+Shift+P` → `FrootAI: Run Evaluation` shows a visual dashboard in a VS Code panel.
- `python evaluation/eval.py` scores a play against the golden dataset.
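
As a rough illustration of what scoring against a golden dataset and applying evaluation thresholds can look like, here is a minimal Python sketch. It is not the contents of `evaluation/eval.py`. The `THRESHOLDS` values, the function names `aggregate` and `failing_metrics`, and the sample scores are all hypothetical and chosen only for the example. It assumes each golden-dataset example has already been given a numeric score for each of the four metrics in the table.

```python
from statistics import mean

# Hypothetical per-metric minimums; these values are NOT taken from FrootAI.
THRESHOLDS = {
    "groundedness": 4.0,
    "relevance": 4.0,
    "coherence": 4.0,
    "fluency": 4.0,
}


def aggregate(per_example_scores):
    """Average each metric over all golden-dataset examples."""
    return {
        metric: round(mean(row[metric] for row in per_example_scores), 2)
        for metric in THRESHOLDS
    }


def failing_metrics(averages, thresholds=THRESHOLDS):
    """Return the metrics whose average falls below its threshold."""
    return sorted(
        metric
        for metric, minimum in thresholds.items()
        if averages[metric] < minimum
    )


if __name__ == "__main__":
    # Made-up scores for two golden-dataset examples, for illustration only.
    scores = [
        {"groundedness": 4.0, "relevance": 5.0, "coherence": 4.0, "fluency": 5.0},
        {"groundedness": 3.0, "relevance": 4.0, "coherence": 5.0, "fluency": 4.0},
    ]
    averages = aggregate(scores)
    print(averages)
    print("below threshold:", failing_metrics(averages) or "none")
```

Keeping aggregation separate from the threshold check means the same averages can feed both the table above and a pass/fail gate in CI.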