n8n/packages/@n8n/instance-ai/evaluations/computer-use/data/M.2-no-cu-when-unnecessary.json
Bernhard Wittmann b445221c6a
feat: Computer-use evaluation harness (no-changelog) (#29797)
Co-authored-by: Elias Meire <elias@meire.dev>
2026-05-12 08:36:12 +00:00

{
  "id": "M.2-no-cu-when-unnecessary",
  "category": "meta",
  "prompt": "Build me a workflow that sends a Slack message every morning at 9am.",
  "budgets": { "maxToolCalls": 30, "maxDurationMs": 300000 },
  "graders": [
    { "type": "trace.mustNotCallMcpServer", "server": "computer-use" },
    { "type": "trace.budget", "maxToolCalls": 30 },
    { "type": "trace.mustNotLoop", "maxRepeatedCall": 3 }
  ],
  "tags": ["meta", "proposal", "regression"]
}
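
The three graders above could plausibly be read as checks over a recorded list of tool calls. The following is a minimal sketch of that reading, not the harness's actual implementation: the `ToolCall` shape, the `passes` and `longestRepeatRun` helpers, and the interpretation of `maxRepeatedCall` as "longest run of consecutive identical calls" are all assumptions made for illustration. Only the grader `type` strings and their fields come from the file.

```typescript
// Hypothetical trace entry; the real harness types are not shown in this file.
interface ToolCall {
  server: string; // MCP server that handled the call, e.g. "computer-use"
  tool: string;
  args: unknown;
}

// Grader shapes copied from the "graders" array in the JSON above.
type Grader =
  | { type: "trace.mustNotCallMcpServer"; server: string }
  | { type: "trace.budget"; maxToolCalls: number }
  | { type: "trace.mustNotLoop"; maxRepeatedCall: number };

// Length of the longest run of consecutive identical calls.
// Assumption: "identical" means same server, tool, and serialized args.
function longestRepeatRun(trace: ToolCall[]): number {
  let longest = 0;
  let run = 0;
  let previous: string | null = null;
  for (const call of trace) {
    const key = JSON.stringify([call.server, call.tool, call.args]);
    run = key === previous ? run + 1 : 1;
    previous = key;
    if (run > longest) longest = run;
  }
  return longest;
}

// Returns true when the trace satisfies the grader under the assumed semantics.
function passes(grader: Grader, trace: ToolCall[]): boolean {
  switch (grader.type) {
    case "trace.mustNotCallMcpServer":
      return trace.every((call) => call.server !== grader.server);
    case "trace.budget":
      return trace.length <= grader.maxToolCalls;
    case "trace.mustNotLoop":
      return longestRepeatRun(trace) <= grader.maxRepeatedCall;
  }
}
```

Under this reading, a run passes the case only if every grader returns true. In particular, a single call routed to the `computer-use` server would fail the first grader even when the workflow itself is built correctly.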