Sometimes it is useful for a custom evaluator or summary evaluator to return multiple metrics. For example, if an LLM judge generates several metrics, you can save time and money by making a single LLM call that produces all of them rather than one call per metric.

To return multiple scores using the Python SDK, return a list of dictionaries of the following form:
```python
[
    # 'key' is the metric name
    # 'score' is the value of a numerical metric
    {"key": string, "score": number},
    # 'value' is the value of a categorical metric
    {"key": string, "value": string},
    ...  # You may log as many as you wish
]
```
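For instance, a concrete return value might look like this (the metric names here are illustrative, not part of the SDK):

```python
[
    {"key": "correctness", "score": 1},
    {"key": "tone", "value": "formal"},
]
```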
To do the same with the JS/TS SDK, return an object with a `results` key whose value is a list of the same form:
```typescript
{ results: [{ key: string, score: number }, ...] }
```
Each of these dictionaries can contain any or all of the feedback fields; check out the linked document for more information.

Example:
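Below is a minimal Python sketch of a custom evaluator that returns both a numerical and a categorical metric. The single LLM-judge call is stubbed out with fixed values, and the metric names are illustrative assumptions, not part of the SDK:

```python
from langsmith.schemas import Example, Run

def multi_metric_evaluator(run: Run, example: Example) -> list:
    # In practice, one LLM-judge call would produce both metrics at once;
    # here the judge's output is hard-coded for illustration.
    judged = {"correctness": 1, "tone": "formal"}  # hypothetical judge output

    return [
        # Numerical metric: logged via 'score'
        {"key": "correctness", "score": judged["correctness"]},
        # Categorical metric: logged via 'value'
        {"key": "tone", "value": judged["tone"]},
    ]
```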