Arvin A
|
55d8b59a48
|
feat(core): Stream tool calls and ship M3 fixtures from LLM eval wire server (no-changelog) (#30983)
|
2026-05-27 14:53:43 +01:00 |
|
José Braulio González Valido
|
76c432c53f
|
fix(ai-builder): Default Switch to case-insensitive in builder hints (#31044)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-27 07:42:54 +00:00 |
|
Albert Alises
|
8bb5db3bbd
|
feat(core): Add runtime skills to Instance AI builders (no-changelog) (#30838)
|
2026-05-27 06:54:40 +00:00 |
|
Albert Alises
|
959f8ca53c
|
refactor(core): Remove web researcher sub-agent (no-changelog) (#31141)
|
2026-05-26 17:25:50 +00:00 |
|
José Braulio González Valido
|
700b32237f
|
feat(ai-builder): Surface WHAT-dimension binary checks per built workflow (no-changelog) (#30932)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-26 12:18:52 +01:00 |
|
José Braulio González Valido
|
96a9521394
|
ci: Use PR head ref in eval experiment names (no-changelog) (#30898)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-22 11:06:50 +00:00 |
|
José Braulio González Valido
|
ada126d9b7
|
test(ai-builder): Validate user-proxy tool outputs against api-types schemas (no-changelog) (#30905)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-21 14:15:44 +00:00 |
|
José Braulio González Valido
|
81ea56fa6b
|
test(ai-builder): Add multi-turn capability for IAI evals (no-changelog) (#30586)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-21 13:03:35 +00:00 |
|
José Braulio González Valido
|
e9b1c7c48f
|
chore(ai-builder): Tag Daytona sandboxes with name prefix and labels (no-changelog) (#30697)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-21 12:43:36 +00:00 |
|
Bernhard Wittmann
|
374e7ed0b2
|
ci: Fail Instance AI discovery evals only on zero-pass scenarios (no-changelog) (#30816)
|
2026-05-21 06:44:18 +00:00 |
|
oleg
|
d7d2cc1442
|
feat(core): Add native agent substrate (no-changelog) (#30015)
Signed-off-by: Oleg Ivaniv <me@olegivaniv.com>
|
2026-05-19 13:48:45 +00:00 |
|
Albert Alises
|
17b64da0f0
|
fix: Add Switch fallback output guidance for workflow builder (#30449)
|
2026-05-15 07:33:25 +00:00 |
|
Albert Alises
|
f1fd79f830
|
fix(core): Avoid unnecessary planner credential prompts (#30451)
|
2026-05-14 15:36:11 +00:00 |
|
José Braulio González Valido
|
9fcd5c5864
|
feat(ai-builder): Tool-driven eval mock handler with per-API quirks registry (#30135)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-14 14:29:10 +00:00 |
|
Albert Alises
|
c3e39f8504
|
fix(ai-builder): Guide builder to prefer httpBearerAuth for Bearer flows (#30309)
|
2026-05-13 17:24:56 +00:00 |
|
Mutasem Aldmour
|
2fd54d8230
|
feat(core): Curate workflow examples for the builder sandbox (#30025)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-13 06:45:39 +00:00 |
|
Bernhard Wittmann
|
c9ecef2a36
|
test: Add discovery eval guardrails for browser/computer-use dispatch (no-changelog) (#30236)
|
2026-05-12 14:29:35 +00:00 |
|
Milorad FIlipović
|
54d62bb4a1
|
fix(core): Update instance-ai evaluator to include pinned subnodes and allow all mcp tools (#30292)
|
2026-05-12 09:13:01 +00:00 |
|
Bernhard Wittmann
|
b445221c6a
|
feat: Computer-use evaluation harness (no-changelog) (#29797)
Co-authored-by: Elias Meire <elias@meire.dev>
|
2026-05-12 08:36:12 +00:00 |
|
José Braulio González Valido
|
95cf41c37c
|
chore(core): Enable Daytona sandbox in Instance AI evals (no-changelog) (#29931)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-12 07:43:04 +00:00 |
|
Albert Alises
|
bb73952fcc
|
fix(core): Defer credential setup during workflow builds (#30181)
|
2026-05-11 15:46:44 +00:00 |
|
Mutasem Aldmour
|
0feec2fea6
|
fix(core): Make placeholder() return string (no-changelog) (#30100)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-11 11:32:35 +00:00 |
|
Mutasem Aldmour
|
d0367a00e8
|
chore: Align pairwise eval builder with production handover (#30019)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-11 11:00:37 +00:00 |
|
José Braulio González Valido
|
30d9a168bc
|
feat(ai-builder): Add --prebuilt-workflows flag for eval CLI (no-changelog) (#29830)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-07 11:47:29 +00:00 |
|
Mutasem Aldmour
|
1cb7c591b3
|
chore: Match production builder step cap in pairwise eval (#29977)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-07 09:53:36 +00:00 |
|
Bernhard Wittmann
|
68560fbb9a
|
refactor: Extract shared eval helpers (no-changelog) (#29800)
|
2026-05-07 08:05:01 +00:00 |
|
José Braulio González Valido
|
2164afc5df
|
chore(ai-builder): Improve eval comparison alert clarity (no-changelog) (#29929)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-06 21:20:49 +00:00 |
|
Albert Alises
|
869dc32c15
|
feat(ai-builder): Speeds up Instance AI eval by parallelizing iterations and trimming mock handler (no-changelog) (#29839)
|
2026-05-06 08:15:33 +00:00 |
|
José Braulio González Valido
|
bbe3e2d148
|
feat(ai-builder): Add per-PR eval regression detection vs LangSmith baseline (#29456)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-06 08:15:08 +00:00 |
|
Mutasem Aldmour
|
fdceec21b9
|
feat: Add pairwise workflow eval pipeline (#29123)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Jaakko Husso <jaakko@n8n.io>
|
2026-05-04 13:26:27 +00:00 |
|
Charlie Kolb
|
f775604c25
|
refactor: Split up instance-ai confirmation endpoint DTO by action (#29179)
|
2026-05-04 10:47:38 +00:00 |
|
Albert Alises
|
c28d501ba1
|
fix(ai-builder): Stop builder from adding auth to inbound trigger nodes by default (#29648)
|
2026-05-04 10:25:17 +00:00 |
|
Albert Alises
|
625ed5e95a
|
fix(core): Harden Set node workflow SDK contract (#29568)
|
2026-04-30 12:10:44 +00:00 |
|
José Braulio González Valido
|
e7f3e6f771
|
feat(ai-builder): Add three new workflow eval test cases (no-changelog) (#29351)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-30 08:11:04 +00:00 |
|
Albert Alises
|
139b803dae
|
fix: Use explicit node references for AI memory session keys (#29473)
|
2026-04-30 07:26:36 +00:00 |
|
José Braulio González Valido
|
4fd68bfc99
|
ci(ai-builder): Parallelize Instance AI eval CI across multiple n8n containers (#29545)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-30 07:22:28 +00:00 |
|
José Braulio González Valido
|
54d9286d92
|
fix(ai-builder): Filter LangSmith eval dataset by local file slugs (#29507)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-29 13:30:29 +00:00 |
|
José Braulio González Valido
|
e503587854
|
docs: Restructure instanceAI evals README for first-time users (no-changelog) (#29095)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-28 07:22:45 +00:00 |
|
José Braulio González Valido
|
29cdd011b0
|
fix(ai-builder): Honor --timeout-ms across the eval harness (no-changelog) (#29219)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-27 12:49:01 +00:00 |
|
José Braulio González Valido
|
250b718936
|
fix(ai-builder): Stop tombstoning example UUIDs in LangSmith dataset sync (no-changelog) (#29112)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-27 08:52:17 +00:00 |
|
José Braulio González Valido
|
540faa7c1d
|
fix(ai-builder): Update eval dataset prompts to match scenario checklists (no-changelog) (#29011)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-24 09:55:18 +00:00 |
|
Benjamin Schroth
|
c961849226
|
feat(ai-builder): Add sub-agent evaluation harness with binary checks (no-changelog) (#28289)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-24 07:50:46 +00:00 |
|
Albert Alises
|
8b33424d0f
|
fix(ai-builder): Stop treating empty defaults as satisfying required for the Split node (#28978)
|
2026-04-23 16:22:19 +00:00 |
|
José Braulio González Valido
|
89e9117d39
|
fix(ai-builder): Expose outputCount when eval node output is truncated (no-changelog) (#28977)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-23 11:06:27 +00:00 |
|
José Braulio González Valido
|
16e5f9572f
|
feat(ai-builder): Add LangSmith integration for workflow eval tracking (no-changelog) (#28835)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-23 08:47:02 +00:00 |
|
Luca Mattiazzi
|
714981eea3
|
feat: Add multiple runs to instanceAI eval (no-changelog) (#28493)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
|
2026-04-22 07:11:10 +00:00 |
|
José Braulio González Valido
|
560f300716
|
test: Add Instance AI workflow evals CI pipeline (no-changelog) (#28366)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-20 14:15:41 +00:00 |
|
José Braulio González Valido
|
ac922fa38c
|
feat(ai-builder): Improve eval verifier and mock handler reliability (no-changelog) (#28255)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-10 13:57:32 +00:00 |
|
José Braulio González Valido
|
91b01d27b9
|
feat(ai-builder): Fix IF/Switch/Filter node misconfiguration in builder (no-changelog) (#28172)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-08 14:35:43 +00:00 |
|
José Braulio González Valido
|
fef91c97dd
|
feat(ai-builder): Add --keep-workflows flag and fix eval execution errors (no-changelog) (#28129)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-07 17:35:04 +00:00 |
|