scripts/mutation-health/
Phase 1 substrate for the Mutation Health Observability initiative.
What is mutation testing?
Line coverage tells you which lines your tests execute. Mutation testing tells you which behavioural changes your tests catch. A file can have 100% line coverage and a 0% mutation score: every line runs during the test suite, but no test would fail if the code were silently broken.
How it works
A mutation testing tool (n8n uses Stryker) does this for each source file:
- Parse the source into an AST.
- Generate small variants ("mutants") by changing nodes in the AST. Examples:

  | Mutator | Original | Mutated |
  |---|---|---|
  | Conditional | `if (item.mode === 'everyX')` | `if (true)`, `if (false)` |
  | Equality | `a === b` | `a !== b` |
  | Boundary | `value > 0` | `value >= 0` |
  | Arithmetic | `return a + b` | `return a - b` |
  | String literal | `'hello'` | `''`, `"Stryker was here!"` |
  | Block statement | `{ x(); return; }` | `{}` |
  | Conditional (ternary) | `cond ? a : b` | `a`, `b`, `cond ? a : a`, `cond ? b : b` |

  There are ~40 mutator categories. One source line typically produces several mutants.
- For each mutant, run the test suite against the mutated code. One of these outcomes:

  | Outcome | Meaning |
  |---|---|
  | Killed | At least one test failed → tests caught the change. ✓ |
  | Survived | All tests passed → tests didn't catch the change. ✗ |
  | NoCoverage | No test even ran the mutated line. |
  | Timeout | Tests hung (counted as detected). |
- Mutation score = `(killed + timeout) / (killed + timeout + survived + no_coverage)`. Higher = more load-bearing assertions.
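As a rough illustration of the formula above, here is a minimal sketch that computes the score as a percentage. The function name `mutationScore` and its argument shape are assumptions made for this example, not part of the scripts in this directory. The counts are the ones from the sample writer payload further down.

```javascript
// Mutation score as a percentage, following the formula above.
// Timeouts count as detected, so they join killed mutants in the numerator.
function mutationScore({ killed, timeout, survived, noCoverage }) {
  const detected = killed + timeout;
  const total = detected + survived + noCoverage;
  // Assumption: with no mutants at all the score is undefined, so return null
  // instead of dividing by zero.
  if (total === 0) return null;
  return (detected / total) * 100;
}

const score = mutationScore({ killed: 39, timeout: 0, survived: 2, noCoverage: 0 });
console.log(score.toFixed(2)); // 95.12
```

Note that `no_coverage` mutants sit in the denominator, so untested lines pull the score down just like survivors do.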
Line coverage vs mutation score — a real example
packages/workflow/src/workflow-checksum.ts:
- Line coverage: 87.09%
- Mutation score: 38.64%
Mutating let hexString = '' to let hexString = "Stryker was here!" survived the test suite. The tests assert that two similar workflows produce different checksums — but never pin the actual output format. Line coverage calls this fine; mutation testing flags it as assertion-light test theatre.
That divergence is exactly why this project exists.
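To make the divergence concrete, here is a small self-contained sketch. The `toHex` helper is a hypothetical stand-in, not the real code in workflow-checksum.ts, and the two "tests" are plain predicates rather than real test cases. It only illustrates how a string-literal mutant can survive a test that compares outputs without pinning their format.

```javascript
// Hypothetical stand-in for a checksum helper; NOT the real workflow-checksum.ts.
// The seed parameter lets us simulate the string-literal mutant.
function toHex(bytes, seed = '') {
  let hexString = seed; // the mutant replaces '' with "Stryker was here!"
  for (const b of bytes) hexString += b.toString(16).padStart(2, '0');
  return hexString;
}

const original = (bytes) => toHex(bytes, '');
const mutant = (bytes) => toHex(bytes, 'Stryker was here!');

// Assertion-light test: only checks that different inputs give different outputs.
const weakTest = (fn) => fn([1, 2]) !== fn([1, 3]);
// Pinned test: asserts the exact output format.
const strongTest = (fn) => fn([1, 2]) === '0102';

console.log(weakTest(original), weakTest(mutant));     // true true  -> mutant survives
console.log(strongTest(original), strongTest(mutant)); // true false -> mutant killed
```

Both tests execute the same lines, so line coverage cannot tell them apart. Only the pinned assertion fails when the code is mutated.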
What's in this directory
| File | Purpose |
|---|---|
| pick-next.mjs | Walk `<pkg>/src/`, merge with the live ledger, return the next source file to mutate |
| emit-payload.mjs | Turn a Stryker summary.json into a BQ-ready writer payload |
The Stryker run itself lives in packages/workflow/scripts/mutate.mjs and is invoked via pnpm --filter=n8n-workflow mutate <src-file>.
The reader and writer webhooks are plain HTTP — the GHA hits them with curl. There is no fetch/post wrapper script; if you want to call them locally, see Local usage.
The BQ table schema lives with the writer workflow (in n8n's internal Quality project), not in this repo — the writer owns the MERGE statement and is the single source of truth.
End-to-end pipeline
[GHA nightly cron, .github/workflows/mutation-health-nightly.yml]
│
├─► curl GET reader webhook → live-ledger.json (current BQ state)
│ │
│ └─► [n8n: QA Mutation Health Reader] ──► SELECT from BQ ledger
│
├─► pick-next.mjs → one source file
│ walks <pkg>/src/, merges with live ledger
│ files missing from ledger are synthesised as `new`
│ priority: new → red → stale → skip green
│ within new: alphabetical
│ within red/stale: lowest score first
│
├─► pnpm --filter=n8n-workflow mutate → summary.json
│
├─► emit-payload.mjs → bq-payload.json
│
└─► curl POST writer webhook → INSERTs event + MERGEs ledger row
↓
[n8n writer workflow: QA: Mutation Health Writer]
↓
┌───────────────────────────────────┐
│ qa_mutation_health_ledger (MERGE) │
│ qa_performance_metrics (INSERT) │
└───────────────────────────────────┘
The writer workflow lives in n8n's internal Quality project. It's created and maintained outside this repo. This README documents the contract it implements.
State transitions
| Trigger | Stored status |
|---|---|
| Source file in src/ but no row yet | synthesised as `new` at pick time; not stored |
| Last run scored ≥ threshold_at_run | `green` |
| Last run scored < threshold_at_run | `red` |
Stored statuses are just two: red and green. new is computed in-memory by the picker for any file in the source tree that has no ledger row yet — the row is only persisted after that file's first scored run. The picker also computes a transient stale state — any green row whose last_checked_at is older than 4 weeks is treated as stale for that pick. No last_checked_sha is needed; no git history is consulted.
Picker priority: new → red → stale → skip fresh green.
- Within `new`: alphabetical (rows exit the bucket as they're scored)
- Within `red`: lowest score first (weakest tests revisited first)
- Within `stale`: oldest `last_checked_at` first (natural cycling of long-stable files)
If every row is green and fresh, the picker exits 0 with {"picked": null, "reason": "all-green"} — a healthy "nothing to do" state, not a failure.
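The selection rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual pick-next.mjs. The function name `pickNext`, the `state` field in the result, and the choice to ignore ledger rows whose file no longer exists in the source tree are all hypothetical. Only the priority order, the 4-week staleness window, and the all-green result come from the text.

```javascript
const STALE_AFTER_MS = 4 * 7 * 24 * 60 * 60 * 1000; // 4 weeks, per the text above

// sourceFiles: paths found under <pkg>/src/; ledger: rows shaped like the reader response.
function pickNext(sourceFiles, ledger, now = new Date()) {
  const byPath = new Map(ledger.map((row) => [row.source_file_path, row]));
  const buckets = { new: [], red: [], stale: [] };

  for (const path of sourceFiles) {
    const row = byPath.get(path);
    if (!row) buckets.new.push({ source_file_path: path }); // synthesised, never stored
    else if (row.status === 'red') buckets.red.push(row);
    else if (now - new Date(row.last_checked_at) > STALE_AFTER_MS) buckets.stale.push(row);
    // Fresh green rows fall through and are skipped.
  }

  // new: alphabetical; red: lowest score first; stale: oldest check first.
  buckets.new.sort((a, b) =>
    a.source_file_path < b.source_file_path ? -1 : a.source_file_path > b.source_file_path ? 1 : 0,
  );
  buckets.red.sort((a, b) => a.last_score - b.last_score);
  buckets.stale.sort((a, b) => new Date(a.last_checked_at) - new Date(b.last_checked_at));

  for (const state of ['new', 'red', 'stale']) {
    if (buckets[state].length > 0) {
      return { picked: buckets[state][0].source_file_path, state };
    }
  }
  return { picked: null, reason: 'all-green' };
}
```

Because `new` is derived from the directory walk rather than stored, a file that is deleted and re-added simply reappears in whichever bucket its ledger row (if any) dictates.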
Webhook contracts
Two n8n workflows back the pipeline. Both live in the internal Quality project (L8csxtEbFpFOWlf8) and are created/maintained outside this repo. Both run unauthenticated (URL-as-secret pattern, matching existing qa_* workers).
| Endpoint | Method | Workflow | Purpose |
|---|---|---|---|
| https://internal.users.n8n.cloud/webhook/mutation-health-writer | POST | QA: Mutation Health Writer (iYEBmBat8OscRTVq) | INSERT events + MERGE ledger |
| `https://internal.users.n8n.cloud/webhook/mutation-health-ledger?package=<name>` | GET | QA: Mutation Health Reader (ZmRsNUwvgfCSq0JI) | Read current ledger state |
Writer webhook
POST https://internal.users.n8n.cloud/webhook/mutation-health-writer with Content-Type: application/json:
{
"ledger": [
{
"source_file_path": "packages/workflow/src/cron.ts",
"package": "n8n-workflow",
"last_score": 95.12,
"threshold_at_run": 80,
"last_checked_at": "2026-05-22T10:03:55.660Z",
"status": "green",
"mutants_killed": 39,
"mutants_survived": 2,
"mutants_no_coverage": 0,
"mutants_timeout": 0
}
],
"events": [
{
"benchmark_name": "mutation_health",
"value": 95.12,
"timestamp": "2026-05-22T10:03:55.660Z",
"dimensions": {
"package": "n8n-workflow",
"source_file": "packages/workflow/src/cron.ts",
"sha": "095239e175",
"status_after": "green",
"threshold": 80,
"mutants_killed": 39,
"mutants_survived": 2,
"mutants_no_coverage": 0,
"mutants_timeout": 0
}
}
]
}
Either array may be empty (manual smoke tests sometimes send only events).
The writer:
- For each `events[]` row → `INSERT` into `qa_performance_metrics`.
- For each `ledger[]` row → `MERGE` into `qa_mutation_health_ledger` on `source_file_path`. Status is always `red` or `green`: the picker synthesises `new` in-memory and never posts it.
The webhook URL is delivered to GHA via the MUTATION_HEALTH_WEBHOOK repo secret. The secret URL itself is the only auth (matches existing qa_* writer pattern); rotate the secret if leaked.
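For orientation, here is a minimal sketch of how a payload with this shape could be assembled from one run's mutant counts. It is not the actual emit-payload.mjs. The function name `buildPayload`, its input object, and the decision to score a run with zero mutants as 0 are assumptions, and the real summary.json layout may differ. Only the output field names are taken from the example payload above.

```javascript
// Hypothetical input shape: run metadata plus mutant counts.
function buildPayload({ sourceFile, pkg, sha, threshold, timestamp, counts }) {
  const { killed, survived, noCoverage, timeout } = counts;
  const total = killed + timeout + survived + noCoverage;
  // Score as a percentage rounded to two decimals.
  // Assumption: a run that produced no mutants scores 0.
  const score = total === 0 ? 0 : Math.round(((killed + timeout) / total) * 10000) / 100;
  // Stored status is only ever red or green, decided against the run's threshold.
  const status = score >= threshold ? 'green' : 'red';
  const mutants = {
    mutants_killed: killed,
    mutants_survived: survived,
    mutants_no_coverage: noCoverage,
    mutants_timeout: timeout,
  };

  return {
    ledger: [{
      source_file_path: sourceFile,
      package: pkg,
      last_score: score,
      threshold_at_run: threshold,
      last_checked_at: timestamp,
      status,
      ...mutants,
    }],
    events: [{
      benchmark_name: 'mutation_health',
      value: score,
      timestamp,
      dimensions: { package: pkg, source_file: sourceFile, sha, status_after: status, threshold, ...mutants },
    }],
  };
}

const payload = buildPayload({
  sourceFile: 'packages/workflow/src/cron.ts',
  pkg: 'n8n-workflow',
  sha: '095239e175',
  threshold: 80,
  timestamp: '2026-05-22T10:03:55.660Z',
  counts: { killed: 39, survived: 2, noCoverage: 0, timeout: 0 },
});
console.log(payload.ledger[0].last_score, payload.ledger[0].status); // 95.12 green
```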
Reader webhook
GET https://internal.users.n8n.cloud/webhook/mutation-health-ledger?package=<pkg>:
{
"ledger": [
{
"source_file_path": "packages/workflow/src/cron.ts",
"package": "n8n-workflow",
"last_score": 95.12,
"threshold_at_run": 80,
"last_checked_at": "2026-05-22T10:03:55.660Z",
"status": "green",
"mutants_killed": 39,
"mutants_survived": 2,
"mutants_no_coverage": 0,
"mutants_timeout": 0
}
]
}
The package query param is validated server-side against the same pnpm-workspace allowlist regex used elsewhere in the pipeline; invalid input returns 500. No SQL is constructed or accepted on the client side — the SELECT is hardcoded in the workflow.
Unauthenticated — the URL is not a secret. The data isn't sensitive (file paths + integer scores), but treat the URL as low-trust: anyone with it can read all current ledger state for the queried package.
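If you would rather call the reader from Node than from curl, something along these lines should work. This is a sketch, not a script that exists in this directory: `buildReaderUrl` and `fetchLedger` are hypothetical names, and it assumes a Node version with a global `fetch` (18 or later).

```javascript
const READER_URL = 'https://internal.users.n8n.cloud/webhook/mutation-health-ledger';

// Build the reader URL; URLSearchParams takes care of percent-encoding the package name.
function buildReaderUrl(pkg) {
  const url = new URL(READER_URL);
  url.searchParams.set('package', pkg);
  return url.toString();
}

// Fetch the current ledger rows for one package.
// Any non-2xx status (including the 500 returned for invalid input) is surfaced as an error.
async function fetchLedger(pkg) {
  const res = await fetch(buildReaderUrl(pkg));
  if (!res.ok) throw new Error(`reader returned HTTP ${res.status}`);
  const body = await res.json();
  return body.ledger;
}

console.log(buildReaderUrl('n8n-workflow'));
// https://internal.users.n8n.cloud/webhook/mutation-health-ledger?package=n8n-workflow
```

Failing on non-2xx mirrors the `curl --fail` behaviour used in the pipeline, so a bad package name is not mistaken for an empty ledger.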
Threshold (provisional)
Runs use STRYKER_THRESHOLD=80 as a placeholder. It will be replaced with an evidence-based value once ~4 weeks of data have accumulated. Until then, treat red/green verdicts as preliminary.
Local usage
# Run Stryker on one file (the inner loop — also invokable via /n8n:mutation-test skill)
pnpm --filter=n8n-workflow mutate src/cron.ts
# Pull current ledger from BQ
curl --fail -sS \
'https://internal.users.n8n.cloud/webhook/mutation-health-ledger?package=n8n-workflow' \
-o /tmp/ledger.json
# Pick the next file to score
node scripts/mutation-health/pick-next.mjs \
--package-dir packages/workflow \
--ledger-file /tmp/ledger.json
# Build a BQ payload from a Stryker run
node scripts/mutation-health/emit-payload.mjs \
--summary packages/workflow/reports/mutation/summary.json \
--package n8n-workflow
# POST the result (requires MUTATION_HEALTH_WEBHOOK to be set)
curl --fail -sS -X POST \
-H 'Content-Type: application/json' \
--data @packages/workflow/reports/mutation/bq-payload.json \
"$MUTATION_HEALTH_WEBHOOK"