FrootAI — AmpliFAI your AI Ecosystem

Quality Metrics

FAI Evaluation Dashboard

Automated quality scoring for every solution play. These metrics run in CI and must pass before any play ships.

| Metric | Threshold | Description |
| --- | --- | --- |
| Groundedness | ≥ 0.95 | % of claims backed by source documents, measured via citation verification. |
| Coherence | ≥ 0.90 | Logical flow and consistency of multi-turn responses. |
| Relevance | ≥ 0.90 | How well the response addresses the user's actual question. |
| Fluency | ≥ 0.95 | Grammatical correctness and natural language quality. |
| Safety | 0 violations | Content safety score: harmful, hateful, sexual, and violent content is blocked. |
| Cost / Query | < $0.01 | Average token cost per query, including retrieval and generation. |
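Groundedness via citation verification can be sketched as a claim-level check: each claim either does or does not appear in the document it cites. This is a minimal illustration, not the actual eval.py implementation; the claim schema (`source_id`, `quote`) is an assumption.

```python
def groundedness(claims: list[dict], sources: dict[str, str]) -> float:
    """Fraction of claims whose cited quote actually appears in the
    cited source document (hypothetical schema: each claim carries a
    'source_id' and the 'quote' it attributes to that source)."""
    if not claims:
        return 0.0
    backed = sum(
        1 for c in claims
        if c["quote"] in sources.get(c["source_id"], "")
    )
    return backed / len(claims)

# Example: one of two claims is backed, so the score is 0.5 —
# well below the 0.95 gate.
sources = {"doc1": "The API limit is 100 requests per minute."}
claims = [
    {"source_id": "doc1", "quote": "100 requests per minute"},
    {"source_id": "doc1", "quote": "unlimited requests"},
]
print(groundedness(claims, sources))  # 0.5
```

Production scorers typically use an LLM judge or NLI model rather than exact substring matching, but the pass/fail arithmetic against the threshold is the same.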

Evaluation Pipeline

  1. Test Set — 50+ question/answer pairs per play, covering edge cases
  2. Run — python evaluation/eval.py scores each metric
  3. Gate — CI blocks deployment if any metric falls below threshold
  4. Report — Results saved to evaluation/results.json
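The gate step above amounts to a threshold check over the scored results. A minimal sketch, assuming a flat results dictionary (the keys and results-file schema here are assumptions, not the actual evaluation/results.json format):

```python
# Thresholds from the dashboard. Safety counts violations, and cost is a
# ceiling, so those two are gated separately from the score minimums.
THRESHOLDS = {
    "groundedness": 0.95,
    "coherence": 0.90,
    "relevance": 0.90,
    "fluency": 0.95,
}

def gate(results: dict) -> list[str]:
    """Return the list of failing metrics; an empty list means the
    play may ship."""
    failures = [
        name for name, floor in THRESHOLDS.items()
        if results.get(name, 0.0) < floor
    ]
    if results.get("safety_violations", 0) > 0:
        failures.append("safety")
    if results.get("cost_per_query", 0.0) >= 0.01:
        failures.append("cost_per_query")
    return failures
```

In CI, the wrapper script would load evaluation/results.json, call `gate()`, print any failing metrics, and exit non-zero so the pipeline blocks deployment.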