José Braulio González Valido
30d9a168bc
feat(ai-builder): Add --prebuilt-workflows flag for eval CLI (no-changelog) ( #29830 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:47:29 +00:00
Mutasem Aldmour
1cb7c591b3
chore: Match production builder step cap in pairwise eval ( #29977 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 09:53:36 +00:00
Bernhard Wittmann
68560fbb9a
refactor: Extract shared eval helpers (no-changelog) ( #29800 )
2026-05-07 08:05:01 +00:00
José Braulio González Valido
2164afc5df
chore(ai-builder): Improve eval comparison alert clarity (no-changelog) ( #29929 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 21:20:49 +00:00
Albert Alises
869dc32c15
feat(ai-builder): Speeds up Instance AI eval by parallelizing iterations and trimming mock handler (no-changelog) ( #29839 )
2026-05-06 08:15:33 +00:00
José Braulio González Valido
bbe3e2d148
feat(ai-builder): Add per-PR eval regression detection vs LangSmith baseline ( #29456 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 08:15:08 +00:00
Mutasem Aldmour
fdceec21b9
feat: Add pairwise workflow eval pipeline ( #29123 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Jaakko Husso <jaakko@n8n.io>
2026-05-04 13:26:27 +00:00
Charlie Kolb
f775604c25
refactor: Split up instance-ai confirmation endpoint DTO by action ( #29179 )
2026-05-04 10:47:38 +00:00
Albert Alises
c28d501ba1
fix(ai-builder): Stop builder from adding auth to inbound trigger nodes by default ( #29648 )
2026-05-04 10:25:17 +00:00
Albert Alises
625ed5e95a
fix(core): Harden Set node workflow SDK contract ( #29568 )
2026-04-30 12:10:44 +00:00
José Braulio González Valido
e7f3e6f771
feat(ai-builder): Add three new workflow eval test cases (no-changelog) ( #29351 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 08:11:04 +00:00
Albert Alises
139b803dae
fix: Use explicit node references for AI memory session keys ( #29473 )
2026-04-30 07:26:36 +00:00
José Braulio González Valido
4fd68bfc99
ci(ai-builder): Parallelize Instance AI eval CI across multiple n8n containers ( #29545 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 07:22:28 +00:00
José Braulio González Valido
54d9286d92
fix(ai-builder): Filter LangSmith eval dataset by local file slugs ( #29507 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 13:30:29 +00:00
José Braulio González Valido
e503587854
docs: Restructure instanceAI evals README for first-time users (no-changelog) ( #29095 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 07:22:45 +00:00
José Braulio González Valido
29cdd011b0
fix(ai-builder): Honor --timeout-ms across the eval harness (no-changelog) ( #29219 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 12:49:01 +00:00
José Braulio González Valido
250b718936
fix(ai-builder): Stop tombstoning example UUIDs in LangSmith dataset sync (no-changelog) ( #29112 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 08:52:17 +00:00
José Braulio González Valido
540faa7c1d
fix(ai-builder): Update eval dataset prompts to match scenario checklists (no-changelog) ( #29011 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 09:55:18 +00:00
Benjamin Schroth
c961849226
feat(ai-builder): Add sub-agent evaluation harness with binary checks (no-changelog) ( #28289 )
...
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 07:50:46 +00:00
Albert Alises
8b33424d0f
fix(ai-builder): Stop treating empty defaults as satisfying required for the Split node ( #28978 )
2026-04-23 16:22:19 +00:00
José Braulio González Valido
89e9117d39
fix(ai-builder): Expose outputCount when eval node output is truncated (no-changelog) ( #28977 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 11:06:27 +00:00
José Braulio González Valido
16e5f9572f
feat(ai-builder): Add LangSmith integration for workflow eval tracking (no-changelog) ( #28835 )
...
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 08:47:02 +00:00
Luca Mattiazzi
714981eea3
feat: Add multiple runs to instanceAI eval (no-changelog) ( #28493 )
...
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-04-22 07:11:10 +00:00
José Braulio González Valido
560f300716
test: Add Instance AI workflow evals CI pipeline (no-changelog) ( #28366 )
...
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 14:15:41 +00:00
José Braulio González Valido
ac922fa38c
feat(ai-builder): Improve eval verifier and mock handler reliability (no-changelog) ( #28255 )
...
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:57:32 +00:00
José Braulio González Valido
91b01d27b9
feat(ai-builder): Fix IF/Switch/Filter node misconfiguration in builder (no-changelog) ( #28172 )
...
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 14:35:43 +00:00
José Braulio González Valido
fef91c97dd
feat(ai-builder): Add --keep-workflows flag and fix eval execution errors (no-changelog) ( #28129 )
...
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:35:04 +00:00
José Braulio González Valido
2383749980
feat(ai-builder): Workflow evaluation framework with LLM mock execution ( #27818 )
...
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Arvin A <51036481+DeveloperTheExplorer@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-04-07 13:31:16 +00:00