Prompt A/B testing and evaluation
Day 22 of 30 · 30 Days of AI
Compare prompts and pick the best performer
Learning goals
- Create two prompt variants (A and B).
- Define 3–4 evaluation criteria.
- Select the better prompt, backed by evidence.
Why it matters
- Small wording changes can produce large differences in output quality.
- Criteria-driven evaluation replaces gut feel with evidence.
Explanation
- Create variants by changing one element at a time: output format, constraints, or the assigned role.
- Score against fixed criteria: accuracy, clarity, brevity, actionability.
- Ask the model to self-score each output, then verify the scores yourself.
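The self-score-then-verify step needs machine-readable scores. A minimal sketch, assuming the judge model replies with one "criterion: N" line per criterion (the reply text below is a made-up example):

```python
import re

# Criteria from the lesson; extend or reorder as needed.
CRITERIA = ["accuracy", "clarity", "brevity", "actionability"]

def parse_scores(judge_reply: str) -> dict:
    """Extract 'criterion: N' scores (1-5) from a judge's free-text reply."""
    scores = {}
    for criterion in CRITERIA:
        match = re.search(rf"{criterion}\s*[:=]\s*([1-5])", judge_reply, re.IGNORECASE)
        if match:
            scores[criterion] = int(match.group(1))
    return scores

# Made-up judge reply for illustration:
reply = "Accuracy: 4\nClarity: 5\nBrevity: 3\nActionability: 4"
print(parse_scores(reply))
# {'accuracy': 4, 'clarity': 5, 'brevity': 3, 'actionability': 4}
```

If a criterion is missing from the reply, it is simply absent from the result, which is itself a useful signal that the judge ignored part of the rubric.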
Examples
- Strong: “Evaluate A vs B on accuracy, clarity, brevity, and actionability; score each 1–5; explain each score.”
- Weak: “Which is better?”
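The strong example above can be expanded into a reusable template. This is an illustrative sketch; the template wording and function name are assumptions, not a fixed API:

```python
# Hypothetical evaluation-prompt template based on the strong example.
EVAL_TEMPLATE = """You are a strict evaluator.
Task: {task}

Output A:
{output_a}

Output B:
{output_b}

Score each output 1-5 on accuracy, clarity, brevity, and actionability.
Give a one-sentence reason per score, then name the overall winner."""

def build_eval_prompt(task: str, output_a: str, output_b: str) -> str:
    """Fill the template with the task and the two candidate outputs."""
    return EVAL_TEMPLATE.format(task=task, output_a=output_a, output_b=output_b)

prompt = build_eval_prompt(
    task="Summarize this bug report in two sentences.",
    output_a="(summary produced by prompt A)",
    output_b="(summary produced by prompt B)",
)
print(prompt)
```

Keeping the rubric inside the template means both variants are always judged on identical criteria.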
Guided exercise (10–15 min)
- Pick a task; write prompts A and B.
- Generate outputs; score them with the criteria; pick a winner.
Deliverables
- A/B outputs generated.
- Scores per criterion.
- Winner chosen with a stated reason.
Resources
- Evaluation prompting: prompt guide
Independent exercise (5–10 min)
- Tweak the weaker prompt and retest.
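The scoring-and-winner step from the guided exercise can be sketched as a small aggregation, assuming one verified score per criterion has already been collected (the numbers below are made up):

```python
from statistics import mean

def pick_winner(scores: dict):
    """scores maps variant -> {criterion: 1-5 score}.
    Returns (winning variant, per-variant average)."""
    averages = {variant: mean(by_criterion.values())
                for variant, by_criterion in scores.items()}
    return max(averages, key=averages.get), averages

# Made-up scores for illustration:
scores = {
    "A": {"accuracy": 4, "clarity": 3, "brevity": 5, "actionability": 3},
    "B": {"accuracy": 4, "clarity": 5, "brevity": 3, "actionability": 4},
}
winner, averages = pick_winner(scores)
print(winner, averages)  # B wins with the higher average
```

A plain average weights every criterion equally; if accuracy matters more for your task, swap in a weighted mean before retesting the tweaked prompt.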