n8n/packages/@n8n/ai-workflow-builder.ee/evaluations/programmatic/python/CONFIGURATION.md

22 KiB

Configuration Format Documentation

This document provides comprehensive documentation for the workflow comparison configuration format.

Table of Contents

Overview

Configuration files can be written in either YAML or JSON format. YAML is recommended for readability and easier maintenance. The configuration controls:

  • Cost weights for different types of graph edits
  • Similarity groups to treat similar node types as equivalent
  • Ignore rules to exclude certain nodes or parameters
  • Parameter comparison rules for flexible matching
  • Exemptions for optional nodes
  • Output formatting preferences

Configuration File Format

YAML Example

version: "1.0"
name: "my-config"
description: "My custom configuration"

costs:
  nodes:
    insertion: 10.0
    deletion: 10.0
    # ... more costs

similarity_groups:
  triggers:
    - "n8n-nodes-base.webhook"
    - "n8n-nodes-base.manualTrigger"

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.temperature"
      tolerance: 0.1

output:
  max_edits: 15

JSON Example

{
  "version": "1.0",
  "name": "my-config",
  "description": "My custom configuration",
  "costs": {
    "nodes": {
      "insertion": 10.0,
      "deletion": 10.0
    }
  },
  "similarity_groups": {
    "triggers": [
      "n8n-nodes-base.webhook",
      "n8n-nodes-base.manualTrigger"
    ]
  }
}

Top-Level Fields

version (string, required)

Configuration format version. Currently "1.0".

version: "1.0"

name (string, optional)

A unique identifier for this configuration.

name: "my-custom-config"

description (string, optional)

Human-readable description of what this configuration does.

description: "Strict comparison for production workflows"

Cost Configuration

The costs section defines penalties for different graph edit operations. These costs directly impact the similarity score.

Structure

costs:
  nodes:
    insertion: <float>
    deletion: <float>
    substitution:
      same_type: <float>
      similar_type: <float>
      different_type: <float>
      trigger_mismatch: <float>

  edges:
    insertion: <float>
    deletion: <float>
    substitution: <float>

  parameters:
    mismatch_weight: <float>
    nested_weight: <float>

Node Costs

costs.nodes.insertion (float, default: 10.0)

Cost penalty when a node exists in the ground truth but is missing from the generated workflow.

Use case: Set higher for stricter matching (e.g., 15.0), lower for lenient matching (e.g., 5.0).

costs:
  nodes:
    insertion: 10.0

costs.nodes.deletion (float, default: 10.0)

Cost penalty when a node exists in the generated workflow but not in the ground truth.

Use case: Set higher to penalize extra nodes more severely.

costs:
  nodes:
    deletion: 15.0

costs.nodes.substitution.same_type (float, default: 1.0)

Cost when two nodes have the same type but different parameters.

Use case:

  • Low values (0.5-1.0): Allow parameter variations
  • High values (2.0-5.0): Require exact parameter matches
costs:
  nodes:
    substitution:
      same_type: 1.0

costs.nodes.substitution.similar_type (float, default: 5.0)

Cost when two nodes are in the same similarity group (see Similarity Groups).

Example: Replacing lmChatOpenAi with lmChatAnthropic (both are LLMs).

costs:
  nodes:
    substitution:
      similar_type: 5.0

costs.nodes.substitution.different_type (float, default: 15.0)

Cost when replacing a node with a completely different type.

Example: Replacing httpRequest with webhook.

costs:
  nodes:
    substitution:
      different_type: 15.0

costs.nodes.substitution.trigger_mismatch (float, default: 50.0)

Special high-cost penalty for trigger node mismatches. Triggers are critical to workflow functionality.

Use case: Keep this high (50.0-100.0) to ensure trigger correctness.

costs:
  nodes:
    substitution:
      trigger_mismatch: 50.0

Edge Costs

costs.edges.insertion (float, default: 5.0)

Cost for a missing connection between nodes.

costs:
  edges:
    insertion: 5.0

costs.edges.deletion (float, default: 5.0)

Cost for an extra connection that shouldn't exist.

costs:
  edges:
    deletion: 5.0

costs.edges.substitution (float, default: 3.0)

Cost for changing the type or properties of a connection.

costs:
  edges:
    substitution: 3.0

Parameter Costs

costs.parameters.mismatch_weight (float, default: 0.5)

Weight multiplier for parameter mismatches within a node.

Formula: parameter_cost = base_cost * mismatch_weight * num_mismatches

costs:
  parameters:
    mismatch_weight: 0.5

costs.parameters.nested_weight (float, default: 0.3)

Weight multiplier for nested/deep parameter differences.

Use case: Set lower to be more forgiving about deep configuration differences.

costs:
  parameters:
    nested_weight: 0.3

Similarity Groups

Similarity groups define sets of node types that should be considered "similar" rather than "different" when substituted. Nodes within the same group incur the similar_type cost instead of different_type.

Structure

similarity_groups:
  <group_name>:
    - "<node_type_1>"
    - "<node_type_2>"
    - "<node_type_3>"

Example

similarity_groups:
  triggers:
    - "n8n-nodes-base.webhook"
    - "n8n-nodes-base.manualTrigger"
    - "n8n-nodes-base.scheduleTrigger"

  ai_llms:
    - "@n8n/n8n-nodes-langchain.lmChatOpenAi"
    - "@n8n/n8n-nodes-langchain.lmChatAnthropic"
    - "@n8n/n8n-nodes-langchain.lmChatOllama"
    - "@n8n/n8n-nodes-langchain.lmChatMistralCloud"

  http_requests:
    - "n8n-nodes-base.httpRequest"
    - "@n8n/n8n-nodes-langchain.toolHttpRequest"

Common Similarity Groups

AI Agents

ai_agents:
  - "n8n-nodes-langchain.agent"
  - "@n8n/n8n-nodes-langchain.agent"
  - "n8n-nodes-langchain.basicAgent"

AI Tools

ai_tools:
  - "@n8n/n8n-nodes-langchain.toolHttpRequest"
  - "@n8n/n8n-nodes-langchain.toolCalculator"
  - "@n8n/n8n-nodes-langchain.toolCode"
  - "@n8n/n8n-nodes-langchain.toolWorkflow"

Ignore Rules

Ignore rules allow you to exclude certain nodes or parameters from comparison. This is useful for:

  • UI-only elements that don't affect workflow execution
  • Metadata fields like IDs and positions
  • Parameters that vary legitimately across implementations

Structure

ignore:
  node_types: [...]
  nodes: [...]
  global_parameters: [...]
  node_type_parameters: {...}
  parameter_paths: [...]

ignore.node_types (list of strings)

Completely ignore nodes of specific types.

Use case: Ignore decorative nodes like sticky notes.

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"
    - "n8n-nodes-base.comment"

ignore.nodes (list of objects)

Flexible rules for ignoring nodes based on name patterns or other criteria.

Structure:

ignore:
  nodes:
    - pattern: "<regex_pattern>"
      reason: "Why this is ignored"
    - name: "<exact_node_name>"
      reason: "Why this is ignored"
    - node_type: "<node_type>"
      reason: "Why this is ignored"

Example:

ignore:
  nodes:
    - pattern: "^Temp.*"
      reason: "Temporary debugging nodes"
    - name: "Development Only"
      reason: "Used only in development"

ignore.global_parameters (list of strings)

Parameter names to ignore across all node types.

Common use case: Ignore UI-specific metadata.

ignore:
  global_parameters:
    - "position"
    - "id"
    - "notes"
    - "notesInFlow"
    - "color"
    - "disabled"

ignore.node_type_parameters (object)

Parameters to ignore for specific node types.

Structure:

ignore:
  node_type_parameters:
    "<node_type>":
      - "<parameter_path_1>"
      - "<parameter_path_2>"

Example:

ignore:
  node_type_parameters:
    "@n8n/n8n-nodes-langchain.agent":
      - "options.systemMessage"  # Allow different prompts
      - "options.maxIterations"   # Allow iteration variance

    "n8n-nodes-base.httpRequest":
      - "options.timeout"  # Timeout can vary by environment

ignore.parameter_paths (list of strings)

Ignore parameters using path patterns. Supports wildcards:

  • * - matches any single path segment
  • ** - matches any number of path segments

Example:

ignore:
  parameter_paths:
    - "options.*.timeout"      # Ignore timeout in any option
    - "**.temperature"         # Ignore temperature at any nesting level
    - "options.advanced.**"    # Ignore all advanced options

Parameter Comparison Rules

Parameter comparison rules allow for flexible matching of specific parameters, such as numeric tolerance or semantic similarity.

Structure

parameter_comparison:
  fuzzy_match: [...]
  numeric_tolerance: [...]

Fuzzy Match Rules

For semantic or approximate text matching.

Structure:

parameter_comparison:
  fuzzy_match:
    - parameter: "<parameter_path>"
      type: "semantic"
      threshold: <float>
      cost_if_below: <float>
      options:
        <key>: <value>

Example:

parameter_comparison:
  fuzzy_match:
    - parameter: "options.systemMessage"
      type: "semantic"
      threshold: 0.8
      cost_if_below: 3.0
      options:
        model: "sentence-transformers"

Numeric Tolerance Rules

For numeric parameters that should be "close enough" rather than exact.

Structure:

parameter_comparison:
  numeric_tolerance:
    - parameter: "<parameter_path>"
      tolerance: <float>
      cost_if_exceeded: <float>

Example:

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.temperature"
      tolerance: 0.1
      cost_if_exceeded: 2.0

    - parameter: "options.maxTokens"
      tolerance: 100
      cost_if_exceeded: 1.0

    - parameter: "options.topP"
      tolerance: 0.05
      cost_if_exceeded: 1.5

How it works:

  • If |value1 - value2| <= tolerance, parameters are considered equal (no cost)
  • If |value1 - value2| > tolerance, cost_if_exceeded is added to the edit cost

Wildcard Support

Parameter paths support wildcards:

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.*.temperature"
      tolerance: 0.1
      cost_if_exceeded: 2.0

This applies to options.llm.temperature, options.model.temperature, etc.

Exemptions

Exemptions reduce penalties for certain nodes that are optional or conditionally required.

Structure

exemptions:
  optional_in_generated: [...]
  optional_in_ground_truth: [...]

exemptions.optional_in_generated (list of objects)

Nodes that can be missing from the generated workflow without full penalty.

Use case: Ground truth has optional nodes that aren't critical.

Structure:

exemptions:
  optional_in_generated:
    - name_pattern: "<regex>"
      penalty: <float>
      reason: "Why this is optional"
    - node_type: "<node_type>"
      penalty: <float>
      when:
        <condition_key>: <condition_value>

Example:

exemptions:
  optional_in_generated:
    - node_type: "@n8n/n8n-nodes-langchain.memoryBufferWindow"
      penalty: 2.0
      reason: "Memory is optional for simple workflows"

    - name_pattern: ".*Debug.*"
      penalty: 1.0
      reason: "Debug nodes are optional in production"

exemptions.optional_in_ground_truth (list of objects)

Nodes that can exist in the generated workflow as extras without full penalty.

Use case: Generated workflow includes helpful but non-essential nodes.

Example:

exemptions:
  optional_in_ground_truth:
    - node_type: "n8n-nodes-base.set"
      penalty: 3.0
      reason: "Set nodes for data transformation are okay to add"

    - node_type: "@n8n/n8n-nodes-langchain.toolCalculator"
      penalty: 2.0
      reason: "Extra tools are acceptable"

Conditional Exemptions

Use the when clause to apply exemptions conditionally:

exemptions:
  optional_in_generated:
    - node_type: "n8n-nodes-base.errorTrigger"
      penalty: 1.0
      when:
        disabled: true
      reason: "Disabled error handlers are optional"

Connection Rules

Rules for handling workflow connections (edges).

Structure

connections:
  ignore_connection_types: [...]
  equivalent_types: [...]

connections.ignore_connection_types (list of strings)

Connection types to completely ignore during comparison.

connections:
  ignore_connection_types:
    - "main"  # Ignore main data flow connections

connections.equivalent_types (list of lists)

Define groups of connection types that should be treated as equivalent.

Example:

connections:
  equivalent_types:
    - ["main", "ai"]
    - ["error", "fallback"]

This means:

  • main and ai connections are interchangeable
  • error and fallback connections are interchangeable

Output Configuration

Controls how results are formatted and presented.

Structure

output:
  max_edits: <integer>
  group_by: "<grouping_strategy>"
  include_explanations: <boolean>
  include_suggestions: <boolean>

output.max_edits (integer, default: 15)

Maximum number of edit operations to return in the results.

Use case:

  • Set higher (e.g., 20-50) for detailed debugging
  • Set lower (e.g., 5-10) for quick summaries
output:
  max_edits: 15

output.group_by (string, default: "priority")

How to group edit operations in the output.

Options:

  • "priority": Group by priority (critical, major, minor)
  • "type": Group by edit type (node, edge, parameter)
  • "cost": Order by cost (highest first)
output:
  group_by: "priority"

output.include_explanations (boolean, default: true)

Include detailed explanations for each edit operation.

output:
  include_explanations: true

output.include_suggestions (boolean, default: true)

Include suggestions for how to fix issues.

output:
  include_suggestions: true

Examples

Example 1: Strict Production Configuration

For production workflows where exact matching is critical:

version: "1.0"
name: "production-strict"
description: "Strict matching for production workflows"

costs:
  nodes:
    insertion: 20.0
    deletion: 20.0
    substitution:
      same_type: 0.5
      similar_type: 10.0
      different_type: 30.0
      trigger_mismatch: 100.0

  edges:
    insertion: 10.0
    deletion: 10.0
    substitution: 5.0

  parameters:
    mismatch_weight: 1.0
    nested_weight: 0.8

similarity_groups:
  triggers:
    - "n8n-nodes-base.webhook"
    - "n8n-nodes-base.scheduleTrigger"

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"

  global_parameters:
    - "position"
    - "id"

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.temperature"
      tolerance: 0.05
      cost_if_exceeded: 5.0

output:
  max_edits: 20
  group_by: "priority"
  include_explanations: true
  include_suggestions: true

Example 2: Lenient Development Configuration

For development workflows where flexibility is needed:

version: "1.0"
name: "development-lenient"
description: "Lenient matching for development and testing"

costs:
  nodes:
    insertion: 5.0
    deletion: 5.0
    substitution:
      same_type: 1.0
      similar_type: 3.0
      different_type: 8.0
      trigger_mismatch: 20.0

  edges:
    insertion: 2.0
    deletion: 2.0
    substitution: 1.0

  parameters:
    mismatch_weight: 0.3
    nested_weight: 0.1

similarity_groups:
  ai_llms:
    - "@n8n/n8n-nodes-langchain.lmChatOpenAi"
    - "@n8n/n8n-nodes-langchain.lmChatAnthropic"
    - "@n8n/n8n-nodes-langchain.lmChatOllama"

  ai_tools:
    - "@n8n/n8n-nodes-langchain.toolHttpRequest"
    - "@n8n/n8n-nodes-langchain.toolCalculator"
    - "@n8n/n8n-nodes-langchain.toolCode"

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"

  global_parameters:
    - "position"
    - "id"
    - "notes"
    - "notesInFlow"
    - "color"
    - "disabled"

  node_type_parameters:
    "@n8n/n8n-nodes-langchain.agent":
      - "options.systemMessage"
      - "options.maxIterations"

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.temperature"
      tolerance: 0.2
      cost_if_exceeded: 1.0

    - parameter: "options.maxTokens"
      tolerance: 500
      cost_if_exceeded: 0.5

exemptions:
  optional_in_generated:
    - node_type: "@n8n/n8n-nodes-langchain.memoryBufferWindow"
      penalty: 1.0
      reason: "Memory is optional"

  optional_in_ground_truth:
    - node_type: "n8n-nodes-base.set"
      penalty: 2.0
      reason: "Data transformation nodes are okay to add"

output:
  max_edits: 10
  group_by: "priority"
  include_explanations: true
  include_suggestions: false

Example 3: AI-Specific Configuration

Optimized for AI workflow comparisons:

version: "1.0"
name: "ai-workflows"
description: "Specialized configuration for AI agent workflows"

costs:
  nodes:
    insertion: 10.0
    deletion: 10.0
    substitution:
      same_type: 1.0
      similar_type: 4.0
      different_type: 15.0
      trigger_mismatch: 50.0

  edges:
    insertion: 5.0
    deletion: 5.0
    substitution: 3.0

  parameters:
    mismatch_weight: 0.4
    nested_weight: 0.2

similarity_groups:
  ai_agents:
    - "n8n-nodes-langchain.agent"
    - "@n8n/n8n-nodes-langchain.agent"
    - "n8n-nodes-langchain.basicAgent"

  ai_llms:
    - "@n8n/n8n-nodes-langchain.lmChatOpenAi"
    - "@n8n/n8n-nodes-langchain.lmChatAnthropic"
    - "@n8n/n8n-nodes-langchain.lmChatOllama"
    - "@n8n/n8n-nodes-langchain.lmChatMistralCloud"
    - "@n8n/n8n-nodes-langchain.lmChatAws"

  ai_tools:
    - "@n8n/n8n-nodes-langchain.toolHttpRequest"
    - "@n8n/n8n-nodes-langchain.toolCalculator"
    - "@n8n/n8n-nodes-langchain.toolCode"
    - "@n8n/n8n-nodes-langchain.toolWorkflow"

  memory_types:
    - "@n8n/n8n-nodes-langchain.memoryBufferWindow"
    - "@n8n/n8n-nodes-langchain.memoryConversation"

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"

  global_parameters:
    - "position"
    - "id"
    - "notes"
    - "color"

  node_type_parameters:
    "@n8n/n8n-nodes-langchain.agent":
      - "options.systemMessage"  # Prompts can legitimately vary

    "@n8n/n8n-nodes-langchain.lmChatOpenAi":
      - "options.modelName"  # Different models okay

    "@n8n/n8n-nodes-langchain.lmChatAnthropic":
      - "options.modelName"

parameter_comparison:
  numeric_tolerance:
    - parameter: "**.temperature"
      tolerance: 0.15
      cost_if_exceeded: 2.0

    - parameter: "**.maxTokens"
      tolerance: 200
      cost_if_exceeded: 1.0

    - parameter: "**.topP"
      tolerance: 0.1
      cost_if_exceeded: 1.5

exemptions:
  optional_in_generated:
    - node_type: "@n8n/n8n-nodes-langchain.memoryBufferWindow"
      penalty: 2.0
      reason: "Memory is optional for stateless workflows"

    - node_type: "@n8n/n8n-nodes-langchain.toolCalculator"
      penalty: 3.0
      reason: "Calculator tool is optional"

  optional_in_ground_truth:
    - node_type: "@n8n/n8n-nodes-langchain.toolCode"
      penalty: 2.0
      reason: "Additional code tools are acceptable"

output:
  max_edits: 15
  group_by: "priority"
  include_explanations: true
  include_suggestions: true

Loading Configuration

From Preset

# Python CLI
uvx --from . python -m compare_workflows workflow1.json workflow2.json --preset standard

# Python API
from config_loader import load_config
config = load_config("preset:standard")

From File

# Python CLI
uvx --from . python -m compare_workflows workflow1.json workflow2.json --config my-config.yaml

# Python API
from config_loader import load_config
config = load_config("/path/to/my-config.yaml")

Programmatically

from config_loader import WorkflowComparisonConfig

# Create from dictionary
config_dict = {
    "version": "1.0",
    "name": "custom",
    "costs": {
        "nodes": {
            "insertion": 12.0
        }
    }
}

config = WorkflowComparisonConfig._from_dict(config_dict)

Best Practices

  1. Start with a preset: Begin with standard, strict, or lenient and customize from there.

  2. Test iteratively: Make small changes and test to understand the impact on similarity scores.

  3. Use similarity groups: Group related node types to avoid harsh penalties for equivalent substitutions.

  4. Ignore UI elements: Always ignore cosmetic parameters like position, id, color, etc.

  5. Set appropriate tolerances: Use numeric tolerances for parameters that shouldn't need exact matches (e.g., temperature, maxTokens).

  6. Document your changes: Use the description field and comments to explain why you made specific choices.

  7. Version control: Keep configuration files in version control alongside your workflows.

  8. Environment-specific configs: Create different configurations for development, testing, and production environments.

Further Reading