You can pass the `max_concurrency` / `maxConcurrency` arg when running large jobs. This parallelizes evaluation by effectively splitting the dataset across threads. Requires `langsmith>=0.3.13`.
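The splitting described above can be sketched with a plain thread pool. This is a stdlib illustration of the idea, not the SDK's internals; `fake_target` and the dataset here are made-up stand-ins:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical dataset of examples, shaped like LangSmith Examples.
examples = [{"inputs": {"text": f"comment {i}"}} for i in range(8)]

def fake_target(inputs: dict) -> dict:
    # Placeholder for a real model call.
    return {"label": "Not toxic"}

def run_all(examples, max_concurrency: int = 4):
    # Conceptually, evaluate() does something similar: each example is
    # run on a worker thread, up to max_concurrency at a time.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(lambda ex: fake_target(ex["inputs"]), examples))

results = run_all(examples)
```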
The `example.inputs` field of each Example is what gets passed to the target function. In this case our `toxicity_classifier` is already set up to take in example inputs, so we can use it directly.
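As a sketch of that contract: the target takes the `example.inputs` dict and returns a dict of outputs. The classifier body below is a toy heuristic standing in for the real model call:

```python
def toxicity_classifier(inputs: dict) -> dict:
    # evaluate() calls the target with each example.inputs dict, so the
    # function's parameter shape must match the dataset's inputs.
    text = inputs["text"]
    # Toy stand-in for a real LLM call.
    label = "Toxic" if "hate" in text.lower() else "Not toxic"
    return {"label": label}

result = toxicity_classifier({"text": "I hate this"})
```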
- `data` - the name OR UUID of the LangSmith dataset to evaluate on, or an iterator of examples
- `evaluators` - a list of evaluators to score the outputs of the function
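A minimal custom evaluator of the kind that list accepts can be a plain function scoring one row. This sketch assumes a dataset whose reference outputs carry a `label` field:

```python
def correct(outputs: dict, reference_outputs: dict) -> bool:
    # LangSmith passes the target's outputs and the example's reference
    # outputs; returning a bool records a binary feedback score.
    return outputs["label"] == reference_outputs["label"]

# Each evaluator in the `evaluators` list is applied to every row.
score = correct({"label": "Toxic"}, {"label": "Toxic"})
```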
`evaluate()` creates an Experiment which can be viewed in the LangSmith UI or queried via the SDK. Evaluation scores are stored against each actual output as feedback.
If you’ve annotated your code for tracing, you can open the trace of each row in a side panel view.