Which evaluation technique can you use to apply your own judgement about the quality of responses to a set of specific prompts?
Model benchmarks
Manual evaluations
Automated evaluations
Which evaluator compares generated responses to ground truth based on standard metrics?
Coherence
F1 Score
Protected material
Which evaluator metric uses an AI model to judge the structure and logical flow of ideas in a response?
Protected material