github-mirror/n8n

mirror of https://github.com/n8n-io/n8n.git synced 2026-06-04 10:39:23 +02:00

Declan Carroll 078e060162

ci: Add duration-based test orchestration for Playwright E2E (#23257)

2025-12-22 13:01:22 +00:00

Custom Test Orchestration

Duration-based test distribution across CI shards, using committed metrics for deterministic runs.

Overview

Instead of Playwright's built-in sharding (which distributes by file count), this approach distributes specs by estimated duration using a greedy bin-packing algorithm. This results in more balanced shard execution times.

Key properties:

Deterministic - Same commit always produces same distribution
Re-run safe - Failed jobs re-run the same specs (no full suite re-run)
Community PR friendly - No secrets needed at runtime
Source agnostic - Metrics can come from any test analytics provider

Scripts

Located in packages/testing/playwright/scripts/:

Script	Purpose
`distribute-tests.mjs`	Assigns specs to shards using bin-packing
`fetch-currents-metrics.mjs`	Fetches duration data from Currents API

Metrics File

Committed at .github/test-metrics/playwright.json

Schema

{
  "updatedAt": "2025-01-15T00:00:00.000Z",
  "source": "currents | playwright | manual | <provider>",
  "specs": {
    "tests/e2e/path/to/spec.ts": {
      "avgDuration": 45000,
      "testCount": 5,
      "flakyRate": 0.02
    }
  }
}

Field	Type	Description
`updatedAt`	ISO 8601	When metrics were last refreshed
`source`	string	Where metrics originated
`specs`	object	Map of spec path to metrics
`specs[path].avgDuration`	number	Average duration in milliseconds
`specs[path].testCount`	number	Number of tests in spec
`specs[path].flakyRate`	number	Flakiness rate (0-1)

Only avgDuration is required for distribution. Other fields are informational.
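As a rough illustration of the schema rules above, the following sketch checks a metrics object before it is committed. The function name `validateMetrics` and the sample entry `tests/e2e/broken.spec.ts` are hypothetical and not part of the repository scripts; the only rule taken from this document is that `avgDuration` is required and `flakyRate` lies between 0 and 1.

```javascript
// Hypothetical helper (not one of the repository scripts): returns a list of
// human-readable problems found in a metrics object shaped like the schema above.
function validateMetrics(metrics) {
  if (typeof metrics !== 'object' || metrics === null) {
    return ['metrics must be an object'];
  }
  const specs = metrics.specs;
  if (typeof specs !== 'object' || specs === null || Array.isArray(specs)) {
    return ['"specs" must be an object keyed by spec path'];
  }
  const errors = [];
  for (const [path, entry] of Object.entries(specs)) {
    // avgDuration is the only field the distribution needs.
    const duration = entry?.avgDuration;
    if (!Number.isFinite(duration) || duration < 0) {
      errors.push(`${path}: avgDuration must be a non-negative number (ms)`);
    }
    // flakyRate is informational, but if present it should be a rate in [0, 1].
    const flaky = entry?.flakyRate;
    if (flaky !== undefined && !(flaky >= 0 && flaky <= 1)) {
      errors.push(`${path}: flakyRate must be between 0 and 1`);
    }
  }
  return errors;
}

const sample = {
  updatedAt: '2025-01-15T00:00:00.000Z',
  source: 'manual',
  specs: {
    'tests/e2e/path/to/spec.ts': { avgDuration: 45000, testCount: 5, flakyRate: 0.02 },
    'tests/e2e/broken.spec.ts': { testCount: 1 }, // missing avgDuration
  },
};

console.log(validateMetrics(sample));
// [ 'tests/e2e/broken.spec.ts: avgDuration must be a non-negative number (ms)' ]
```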

CI Usage

Enabling Custom Orchestration

In the workflow call to playwright-test-reusable.yml:

jobs:
  e2e-tests:
    uses: ./.github/workflows/playwright-test-reusable.yml
    with:
      test-command: pnpm --filter=n8n-playwright test:local
      shards: 8
      use-custom-orchestration: true  # Enable duration-based distribution
    secrets: inherit

How It Works

When use-custom-orchestration: true:

# Each shard runs:
SPECS=$(node packages/testing/playwright/scripts/distribute-tests.mjs "$TOTAL_SHARDS" "$SHARD_INDEX")
# $SPECS is intentionally unquoted so each space-separated path becomes its own argument
pnpm test:local --workers=2 $SPECS

The distribute script:

Reads committed metrics from .github/test-metrics/playwright.json
Sorts specs by duration (descending)
Assigns each spec to the lightest shard (greedy bin-packing)
Outputs space-separated spec paths for the requested shard

Distribution Algorithm

Input: specs sorted by duration [100s, 80s, 60s, 40s, 30s, 20s]
Shards: 3

Step 1: 100s → Shard 0 (lightest)  → [100, 0, 0]
Step 2: 80s  → Shard 1 (lightest)  → [100, 80, 0]
Step 3: 60s  → Shard 2 (lightest)  → [100, 80, 60]
Step 4: 40s  → Shard 2 (lightest)  → [100, 80, 100]
Step 5: 30s  → Shard 1 (lightest)  → [100, 110, 100]
Step 6: 20s  → Shard 0 (lightest)  → [120, 110, 100]

Result: Shard totals of 120s, 110s, and 100s (about 110s each) instead of an uneven distribution
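The steps above can be sketched in a few lines of JavaScript. This is a minimal illustration of greedy bin-packing, not the contents of `distribute-tests.mjs`: the function name `distribute`, the placeholder spec names, and the tie-breaking rules (path order for equal durations, lowest shard index for equal totals) are assumptions made here to keep the output deterministic.

```javascript
// Sketch of greedy bin-packing: longest spec first, always onto the lightest shard.
// `specs` has the shape of the "specs" map in the metrics file (durations in ms).
function distribute(specs, totalShards) {
  const shards = Array.from({ length: totalShards }, () => ({ total: 0, paths: [] }));

  const sorted = Object.entries(specs)
    .map(([path, entry]) => ({ path, duration: entry.avgDuration }))
    // Descending by duration; equal durations fall back to path order (assumption).
    .sort((a, b) => b.duration - a.duration || (a.path < b.path ? -1 : 1));

  for (const spec of sorted) {
    // Find the shard with the smallest running total (lowest index wins ties).
    let lightest = shards[0];
    for (const shard of shards) {
      if (shard.total < lightest.total) lightest = shard;
    }
    lightest.total += spec.duration;
    lightest.paths.push(spec.path);
  }
  return shards;
}

// The worked example above: six specs across three shards (names are placeholders).
const example = {
  'a.spec.ts': { avgDuration: 100000 },
  'b.spec.ts': { avgDuration: 80000 },
  'c.spec.ts': { avgDuration: 60000 },
  'd.spec.ts': { avgDuration: 40000 },
  'e.spec.ts': { avgDuration: 30000 },
  'f.spec.ts': { avgDuration: 20000 },
};

const result = distribute(example, 3);
console.log(result.map((shard) => shard.total / 1000)); // [ 120, 110, 100 ]
console.log(result[0].paths.join(' '));                 // a.spec.ts f.spec.ts
```

Joining a shard's `paths` with spaces gives the kind of space-separated list that the shell snippet under "How It Works" captures in `SPECS`.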

Refreshing Metrics

Using Currents API

CURRENTS_API_KEY=<key> node packages/testing/playwright/scripts/fetch-currents-metrics.mjs --project=<id>

The script:

Fetches test durations from Currents API (last 30 days)
Aggregates by spec file
Validates against pnpm playwright test --list --project="standard:e2e"
Reports drift (stale specs removed, new specs added with 30s default)
Writes to .github/test-metrics/playwright.json

Using Other Sources

Create a script that outputs the same JSON schema. The distribution only requires:

{
  "specs": {
    "tests/e2e/example.spec.ts": { "avgDuration": 45000 }
  }
}
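A script for another source mostly needs to average durations per spec file and emit the schema. The sketch below assumes a hypothetical input format, one `{ spec, durationMs }` record per spec run; the function name `buildMetrics` and the record shape are illustrative and would need adapting to whatever the chosen provider actually exports.

```javascript
// Hypothetical converter: turns per-run records into the metrics schema.
// `records` is assumed to look like [{ spec: 'tests/e2e/x.spec.ts', durationMs: 1234 }, ...].
function buildMetrics(records, source, now = new Date()) {
  // Accumulate the total duration and run count for each spec file.
  const totals = new Map();
  for (const { spec, durationMs } of records) {
    const entry = totals.get(spec) ?? { sum: 0, runs: 0 };
    entry.sum += durationMs;
    entry.runs += 1;
    totals.set(spec, entry);
  }

  // Sort by path so the committed file produces stable diffs.
  const specs = {};
  const sortedEntries = [...totals].sort(([a], [b]) => (a < b ? -1 : 1));
  for (const [spec, { sum, runs }] of sortedEntries) {
    specs[spec] = { avgDuration: Math.round(sum / runs) };
  }
  return { updatedAt: now.toISOString(), source, specs };
}

const records = [
  { spec: 'tests/e2e/other.spec.ts', durationMs: 12000 },
  { spec: 'tests/e2e/example.spec.ts', durationMs: 40000 },
  { spec: 'tests/e2e/example.spec.ts', durationMs: 50000 },
];

const metrics = buildMetrics(records, 'manual', new Date('2025-01-15T00:00:00.000Z'));
console.log(JSON.stringify(metrics, null, 2));
// "tests/e2e/example.spec.ts" averages to 45000, "tests/e2e/other.spec.ts" to 12000
```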

When to Refresh

When new spec files are added (they get 30s default until refreshed)
When specs are deleted/renamed (stale entries are filtered out)
Periodically to capture duration changes (weekly recommended)
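The weekly recommendation can be checked mechanically from the `updatedAt` field. The following is only a sketch: the helper name `metricsAreStale` and the seven-day threshold constant are assumptions derived from the "weekly recommended" guidance, not an existing check in the repository.

```javascript
// Hypothetical staleness check based on the "weekly recommended" guidance above.
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

function metricsAreStale(updatedAt, now = new Date(), maxAgeMs = WEEK_MS) {
  const updated = Date.parse(updatedAt);
  // Treat a missing or unparseable timestamp as stale so it gets refreshed.
  if (Number.isNaN(updated)) return true;
  return now.getTime() - updated > maxAgeMs;
}

const updatedAt = '2025-01-15T00:00:00.000Z';
console.log(metricsAreStale(updatedAt, new Date('2025-01-20T00:00:00.000Z'))); // false (5 days old)
console.log(metricsAreStale(updatedAt, new Date('2025-01-23T00:00:00.000Z'))); // true (8 days old)
```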

Maintenance

Detecting Drift

Run the fetch script - it reports mismatches:

Stale specs (in metrics but not Playwright):
  - tests/e2e/deleted/old.spec.ts

New specs (in Playwright but not metrics, using 30s default):
  - tests/e2e/new/feature.spec.ts
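The drift report boils down to two set differences between the paths in the metrics file and the paths Playwright lists. The sketch below shows that comparison under stated assumptions: `detectDrift` and `reconcile` are hypothetical names, `tests/e2e/kept.spec.ts` is a placeholder path, and the 30s default is the one this document describes for new specs. It is not the logic of `fetch-currents-metrics.mjs` itself.

```javascript
// Hypothetical helpers illustrating drift detection; paths are plain strings.
const DEFAULT_DURATION_MS = 30000; // 30s default for specs without metrics

function detectDrift(metricsPaths, playwrightPaths) {
  const inMetrics = new Set(metricsPaths);
  const inPlaywright = new Set(playwrightPaths);
  return {
    // In metrics but no longer listed by Playwright.
    stale: metricsPaths.filter((path) => !inPlaywright.has(path)),
    // Listed by Playwright but missing from metrics.
    added: playwrightPaths.filter((path) => !inMetrics.has(path)),
  };
}

// Returns a new "specs" map with stale entries dropped and new specs defaulted.
function reconcile(specs, playwrightPaths) {
  const { stale, added } = detectDrift(Object.keys(specs), playwrightPaths);
  const next = { ...specs };
  for (const path of stale) delete next[path];
  for (const path of added) next[path] = { avgDuration: DEFAULT_DURATION_MS };
  return { specs: next, stale, added };
}

const committed = {
  'tests/e2e/deleted/old.spec.ts': { avgDuration: 20000 },
  'tests/e2e/kept.spec.ts': { avgDuration: 45000 },
};
const listed = ['tests/e2e/kept.spec.ts', 'tests/e2e/new/feature.spec.ts'];

const drift = reconcile(committed, listed);
console.log(drift.stale); // [ 'tests/e2e/deleted/old.spec.ts' ]
console.log(drift.added); // [ 'tests/e2e/new/feature.spec.ts' ]
console.log(drift.specs['tests/e2e/new/feature.spec.ts'].avgDuration); // 30000
```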

Manual Metrics Entry

For new specs before CI data exists:

{
  "specs": {
    "tests/e2e/new/feature.spec.ts": {
      "avgDuration": 45000,
      "testCount": 3,
      "flakyRate": 0
    }
  }
}

Estimate the duration from similar specs, or use 30000 (30s) as the default.

Troubleshooting

Specs not running

Check that the spec path in playwright.json matches exactly what Playwright outputs:

pnpm --filter=n8n-playwright playwright test --list --project="standard:e2e"

Unbalanced shards

Refresh metrics - durations may have changed significantly since last update.

Script errors

# Test distribution locally
node packages/testing/playwright/scripts/distribute-tests.mjs 8 0

# Validate metrics file (prints the number of specs; fails if the JSON is invalid)
jq '.specs | length' .github/test-metrics/playwright.json
