langgraph
is a library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. Evaluating langgraph
graphs can be challenging because a single invocation can involve many LLM calls, and which LLM calls are made may depend on the outputs of preceding calls. In this guide we will focus on the mechanics of how to pass graphs and graph nodes to evaluate()
/ aevaluate()
. For evaluation techniques and best practices when building agents head to the langgraph docs.
langsmith>=0.2.0
evaluate
or aevaluate
. If any of you nodes are defined as async, you’ll need to use aevaluate
langsmith>=0.2.0
langgraph
is that the output of a graph is a state object that often already carries information about the intermediate steps taken. Usually we can evaluate whatever we’re interested in just by looking at the messages in our state. For example, we can look at the messages to assert that the model invoked the ‘search’ tool upon as a first step.
Requires langsmith>=0.2.0
langgraph
makes it easy to do this. In this case we can even continue using the evaluators we’ve been using.
Click to see a consolidated code snippet