24 Commits

Author SHA1 Message Date
Eugene
00014420b1
refactor(core): Remove multi-agent architecture entry point from AI workflow builder (no-changelog) (#27925)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:32:19 +00:00
Eugene
6294b0e56f
feat(ai-builder): Add agent text response evaluation and workflow changes binary check (no-changelog) (#27755)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 07:49:10 +00:00
Eugene
6314cd4842
feat(ai-builder): Support dataset context and conversation history in evaluations (no-changelog) (#27618)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-03-30 08:14:01 +00:00
oleg
834966e145
feat(ai-builder): Add binary-checks evaluation suite (no-changelog) (#26415)
Signed-off-by: Oleg Ivaniv <me@olegivaniv.com>
2026-03-04 08:42:05 +00:00
Benjamin Schroth
ce32754088
feat(ai-builder): Add subgraph evaluation framework for responder (no-changelog) (#25419)
2026-02-18 12:33:34 +00:00
Benjamin Schroth
e4ac345eda
feat(ai-builder): Implement workflow execution in evaluations (no-changelog) (#25814)
2026-02-18 08:41:56 +00:00
Eugene
892f086579
feat(core): Add introspection diagnostic tool for AI workflow builder (#25172)
2026-02-12 10:57:44 +00:00
Mutasem Aldmour
9729c2a5da
feat(ai-builder): Add code-base workflow builder (#24535)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-10 12:24:27 +00:00
Michael Drury
832e580b39
chore(ai-builder): Add CSV output for evaluation results (#25193)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 09:02:14 +00:00
oleg
24bb638982
refactor(ai-builder): Consolidate AI Workflow Builder agents and simplify prompts (no-changelog) (#25020)
Signed-off-by: Oleg Ivaniv <me@olegivaniv.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-02-02 13:57:48 +00:00
Albert Alises
341976947f
feat(ai-builder): Add webhook notifications with HMAC authentication for AI evaluation results (#24766)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-01-26 10:16:05 +00:00
Albert Alises
09222733e1
feat(ai-builder): Add webhook notifications for AI evaluation results (#24653)
2026-01-23 09:39:04 +00:00
Albert Alises
99cb5982a0
ci(core): Add automated AI workflow builder evaluations (#24582)
2026-01-21 15:30:45 +00:00
oleg
448522142c
feat(ai-builder): Add per-stage model configuration for evaluations (no-changelog) (#24344)
Signed-off-by: Oleg Ivaniv <me@olegivaniv.com>
2026-01-15 12:58:49 +00:00
oleg
734bed4f84
fix(ai-builder): Remove pairwise multi-gen evals and improve logs (no-changelog) (#24270)
Signed-off-by: Oleg Ivaniv <me@olegivaniv.com>
2026-01-14 08:44:09 +00:00
oleg
f880a74d99
refactor(ai-builder): Implement unified evaluations harness (#23955)
Signed-off-by: Oleg Ivaniv <me@olegivaniv.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-01-13 12:11:13 +00:00
oleg
3504b982b5
chore(ai-builder): Remove legacy agent and make multi-agent default (no-changelog) (#24076)
Signed-off-by: Oleg Ivaniv <me@olegivaniv.com>
2026-01-13 10:24:27 +00:00
oleg
be6d68408d
refactor(ai-builder): Improve pairwise evaluation architecture and LangSmith integration (no-changelog) (#23084)
2025-12-15 12:11:25 +01:00
Michael Drury
33a6aa665c
fix(ai-builder): Allow setting evaluation feature flags via environment variables (no-changelog) (#22813)
2025-12-05 13:37:56 +00:00
Eugene
2b53cc0f5c
feat(ai-builder): CSV input for evals (no-changelog) (#21150)
2025-10-24 14:15:36 +02:00
oleg
cf3b0f5b5a
refactor(ai-builder): Refactor prompt caching evals (#20911)
Signed-off-by: Oleg Ivaniv <me@olegivaniv.com>
2025-10-24 09:34:43 +02:00
Eugene
49aa80fac1
feat: Add programmatic evaluations for workflow builder (no-changelog) (#20214)
2025-10-02 16:40:30 +02:00
Mutasem Aldmour
2ba544284f
fix: Update builder to better work with loops and binary data (no-changelog) (#19040)
2025-09-02 11:41:26 +02:00
oleg
fb3a2ae216
feat: Evaluation framework for AI Workflow Builder (#18016)
2025-08-20 11:11:14 +02:00