docs(instance-ai): update native agents architecture docs

Oleg Ivaniv 2026-05-05 12:55:35 +02:00
parent 7cce9b1621
commit 321af6d7a0
6 changed files with 62 additions and 46 deletions

@ -468,13 +468,15 @@ pnpm typecheck
- [x] Rewrite workspace providers against native agents workspace interfaces.
- [x] Replace Mastra MCP usage with native MCP or native dynamic tools.
- [ ] Update LangSmith tracing to native telemetry/events where possible.
- [ ] Update docs that mention Mastra runtime behavior.
- [x] Update docs that mention Mastra runtime behavior.
- [x] Add tests for the native runtime path.
- [x] Verify no `@mastra/*` imports remain.
Current remaining cleanup:
Current remaining implementation gaps:
- Update architecture/tooling docs that still describe the old Mastra runtime.
- Add the destructive/fresh native persistence migration.
- Decide whether LangSmith should move further onto native telemetry/events or
keep the current product-level spans.
## Acceptance Criteria

@ -30,11 +30,11 @@ const data: ExecutionResult = parseExecutionResult(response);
### Zod schemas are the source of truth
Every tool has an input schema (what the LLM sends) and an output schema
(what the tool returns). Mastra uses these schemas to generate tool
(what the tool returns). Native agents use these schemas to generate tool
descriptions for the LLM, validate inputs at runtime, and type-check the
execute function. If the TypeScript type and the Zod schema are defined
separately, they drift — the LLM sees one contract, the code enforces
another, and bugs hide until production.
handler function. If the TypeScript type and the Zod schema are defined
separately, they drift — the LLM sees one contract, the code enforces another,
and bugs hide until production.
```typescript
// NEVER — separate schema and type that can drift
@ -118,10 +118,10 @@ it('should stream tool-call event when agent uses a tool', async () => {
### Test the contract, not the internals
The clean interface boundary (ADR-002) makes each layer testable in
isolation. Verify the contract at each boundary — not the wiring between
them. Tools can be tested without Mastra, the reducer without SSE, adapters
without the agent.
The clean interface boundary (ADR-002) makes each layer testable in isolation.
Verify the contract at each boundary — not the wiring between them. Tools can
be tested without an agent run, the reducer without SSE, adapters without the
agent.
For each tool, test:
- Valid input → expected output shape
@ -203,15 +203,15 @@ When backend and frontend both switch on event types with duplicated logic,
a change to the format requires updating both in lockstep. Extract the
shared part into `@n8n/api-types` or a shared utility.
## Mastra Patterns
## Native Agent Tool Patterns
### Tool definitions
Mastra uses Zod schemas for both runtime validation and LLM tool
descriptions. The `.describe()` strings on schema fields become the
parameter descriptions the LLM sees when deciding how to call a tool.
Missing or vague descriptions lead to bad tool calls. The `outputSchema`
lets Mastra validate return values and gives the LLM structured expectations.
Native agents use Zod schemas for both runtime validation and LLM tool
descriptions. The `.describe()` strings on schema fields become the parameter
descriptions the LLM sees when deciding how to call a tool. Missing or vague
descriptions lead to bad tool calls. The output schema validates return values
and gives the LLM structured expectations.
- Always define both `inputSchema` and `outputSchema`
- Use `.describe()` on Zod fields — these are the LLM's parameter docs

@ -114,12 +114,13 @@ of a fixed taxonomy (Builder, Debugger, Evaluator), the orchestrator specifies:
Sub-agents are stateless (ADR-011), get clean context windows, and publish events
directly to the event bus (ADR-014). They cannot spawn their own sub-agents.
### 3. Observational Memory
### 3. Long-Context Compaction
Mastra's observational memory compresses old messages into dense observations via
background Observer and Reflector agents. Tool-heavy workloads (workflow
definitions, execution results) get 5–40x compression. This prevents context
degradation over 50+ step autonomous loops (see ADR-016).
Instance AI uses operational compaction to summarize long builder and
orchestrator threads. Tool-heavy workloads (workflow definitions, execution
results) are reduced into compact summaries that keep the current task,
verification state, and unresolved setup details available without preserving
the full raw transcript.
### 4. Structured System Prompt
@ -193,12 +194,12 @@ The agent package — framework-agnostic business logic.
- **Workflow loop** (`workflow-loop/`) — deterministic build→verify→debug state machine for workflow builder agents
- **Workflow builder** (`workflow-builder/`) — TypeScript SDK code parsing, validation, patching, and prompt sections
- **Workspace** (`workspace/`) — sandbox provisioning (Daytona / local), filesystem abstraction, snapshot management
- **Memory** (`memory/`) — title generation, memory configuration
- **Memory** (`memory/`) — title generation and native thread/message helpers
- **Compaction** (`compaction/`) — LLM-based message history summarization for long conversations
- **Storage** (`storage/`) — iteration logs, task storage, planned task storage, workflow loop storage, agent tree snapshots
- **MCP client** (`mcp/`) — manages connections to external MCP servers, schema sanitization for Anthropic compatibility
- **Domain access** (`domain-access/`) — domain gating and access tracking for external URL approval
- **Stream mapping** (`stream/`) — Mastra chunk → canonical event translation, HITL consumption
- **Stream mapping** (`stream/`) — native agent `StreamChunk` → canonical event translation, HITL consumption
- **Event bus interface** (`event-bus/`) — publishing agent events to the thread channel
- **Tracing** (`tracing/`) — LangSmith integration for step-level observability
- **System prompt** (`agent/`) — dynamic context-aware prompt based on instance configuration
@ -293,21 +294,21 @@ Instance AI uses n8n's module system (`@BackendModule`). This means:
## Runtime & Streaming
The agent runtime is built on Mastra's streaming primitives with added
The agent runtime is built on native `@n8n/agents` streaming primitives with
resumability, HITL suspension, and background task management.
### Stream Execution
```
streamAgentRun() → agent.stream() → executeResumableStream()
├─ for each chunk: mapMastraChunkToEvent() → eventBus.publish()
├─ on suspension: wait for confirmation → agent.resumeStream() → loop
└─ return StreamRunResult {status, mastraRunId, text}
├─ for each chunk: mapAgentChunkToEvent() → eventBus.publish()
├─ on suspension: wait for confirmation → agent.resume('stream', ...) → loop
└─ return StreamRunResult {status, agentRunId, text}
```
The `executeResumableStream()` loop consumes Mastra chunks, translates them to
canonical `InstanceAiEvent` schema, publishes to the event bus, and handles HITL
suspension/resume cycles. Two control modes:
The `executeResumableStream()` loop consumes native agent chunks, translates
them to the canonical `InstanceAiEvent` schema, publishes to the event bus, and
handles HITL suspension/resume cycles. Two control modes:
- **Manual** — returns suspension to caller (used by the orchestrator's main run)
- **Auto** — waits for confirmation and resumes automatically (used by background sub-agents)
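
The difference between the two control modes can be sketched as a small loop. This is a minimal illustration, not the actual `executeResumableStream()` implementation: the `Chunk`, `CanonicalEvent`, and `LoopOptions` shapes and the `runResumableLoop` and `mapChunkToEvent` names are assumptions made for the example, and the real `@n8n/agents` types may differ.

```typescript
// Hypothetical shapes for illustration; the real @n8n/agents types may differ.
type Chunk =
  | { type: 'text'; text: string }
  | { type: 'suspended'; toolCallId: string };

type CanonicalEvent =
  | { type: 'text-delta'; text: string }
  | { type: 'confirmation-request'; toolCallId: string };

interface StreamRunResult {
  status: 'completed' | 'suspended';
  text: string;
  pendingToolCallId?: string;
}

interface LoopOptions {
  mode: 'manual' | 'auto';
  publish: (event: CanonicalEvent) => void;
  // Only used in auto mode: resolves once the confirmation has been answered.
  waitForConfirmation?: (toolCallId: string) => Promise<boolean>;
  // Resumes the suspended run and returns the next stream of chunks.
  resume: (toolCallId: string, approved: boolean) => AsyncIterable<Chunk>;
}

// Translate a runtime chunk into the canonical event published on the bus.
function mapChunkToEvent(chunk: Chunk): CanonicalEvent {
  return chunk.type === 'text'
    ? { type: 'text-delta', text: chunk.text }
    : { type: 'confirmation-request', toolCallId: chunk.toolCallId };
}

async function runResumableLoop(
  initial: AsyncIterable<Chunk>,
  options: LoopOptions,
): Promise<StreamRunResult> {
  let stream = initial;
  let text = '';

  for (;;) {
    let suspendedOn: string | undefined;

    // Publish every chunk and remember whether the run suspended.
    for await (const chunk of stream) {
      options.publish(mapChunkToEvent(chunk));
      if (chunk.type === 'text') text += chunk.text;
      else suspendedOn = chunk.toolCallId;
    }

    if (suspendedOn === undefined) return { status: 'completed', text };

    // Manual mode: hand the suspension back to the caller.
    if (options.mode === 'manual' || !options.waitForConfirmation) {
      return { status: 'suspended', text, pendingToolCallId: suspendedOn };
    }

    // Auto mode: wait for the answer, resume, and keep consuming.
    const approved = await options.waitForConfirmation(suspendedOn);
    stream = options.resume(suspendedOn, approved);
  }
}
```

In this sketch a caller in manual mode receives `status: 'suspended'` and decides when to resume, while a caller in auto mode only ever sees the final `completed` result.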

@ -6,15 +6,18 @@ Today the main consumer is the workflow builder. The agent writes TypeScript fil
## How the Pieces Fit Together
There are three layers between the agent and actual code execution: a workspace abstraction from Mastra, a sandbox provider (Daytona, n8n sandbox service, or local), and the execution runtime inside the sandbox. Here is how they relate:
There are three layers between the agent and actual code execution: the native
agents workspace abstraction, a sandbox provider (Daytona, n8n sandbox service,
or local), and the execution runtime inside the sandbox. Here is how they
relate:
```mermaid
graph TB
subgraph Agent ["Agent Layer"]
LLM[LLM] --> AgentRuntime["Agent Runtime (Mastra)"]
LLM[LLM] --> AgentRuntime["Native Agent Runtime"]
end
subgraph WorkspaceLayer ["Workspace Abstraction (Mastra)"]
subgraph WorkspaceLayer ["Native Workspace Abstraction"]
AgentRuntime --> Workspace["Workspace"]
Workspace --> FS["Filesystem Interface<br/>(read, write, list, edit files)"]
Workspace --> Sandbox["Sandbox Interface<br/>(execute shell commands)"]
@ -46,14 +49,19 @@ graph TB
The agent never talks to Daytona, the n8n sandbox service, or the host filesystem directly. It only sees the Workspace, which exposes two capabilities: a filesystem (read/write/list files) and a sandbox (run shell commands). The Workspace routes those operations to whichever provider is configured.
## Mastra Workspaces
## Native Agent Workspaces
Mastra is the agent framework that Instance AI uses. A Mastra **Workspace** is a pairing of two things:
Instance AI uses the workspace abstraction from `@n8n/agents`. A native
**Workspace** is a pairing of two things:
1. **A Sandbox** — an interface for executing shell commands. It accepts a command string and returns stdout, stderr, and an exit code. Think of it as a remote terminal.
2. **A Filesystem** — an interface for file operations: read, write, list, delete, copy, move. Think of it as a remote disk.
When a Workspace is attached to an agent, Mastra automatically exposes built-in tools to the LLM: `read_file`, `write_file`, `edit_file`, `list_files`, `grep`, `execute_command`, and others. The agent uses these tools naturally in its reasoning loop — it writes a file, runs a command, reads the output, and decides what to do next.
When a Workspace is attached to an agent, the native runtime exposes workspace
tools to the LLM, including `workspace_read_file`, `workspace_write_file`,
`workspace_list_files`, and `workspace_execute_command`. The agent uses these
tools naturally in its reasoning loop — it writes a file, runs a command, reads
the output, and decides what to do next.
The key design property is that the Workspace abstraction is provider-agnostic. The agent's code and prompts are identical regardless of whether the workspace is backed by a remote container or a local directory. The provider choice is purely an infrastructure decision.
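
The provider-agnostic property can be illustrated with a small sketch. The `Sandbox`, `Filesystem`, and `Workspace` interfaces below are simplified assumptions written for this example, not the actual `@n8n/agents` definitions, and `InMemoryFilesystem`, `EchoSandbox`, and `writeAndCheck` are hypothetical names.

```typescript
// Hypothetical interfaces for illustration; the real @n8n/agents types may differ.
interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

interface Sandbox {
  execute(command: string): Promise<CommandResult>;
}

interface Filesystem {
  readFile(path: string): Promise<string>;
  writeFile(path: string, content: string): Promise<void>;
  listFiles(prefix: string): Promise<string[]>;
}

interface Workspace {
  sandbox: Sandbox;
  filesystem: Filesystem;
}

// A toy provider that keeps files in memory, e.g. for unit tests.
class InMemoryFilesystem implements Filesystem {
  private readonly files = new Map<string, string>();

  async readFile(path: string): Promise<string> {
    const content = this.files.get(path);
    if (content === undefined) throw new Error(`File not found: ${path}`);
    return content;
  }

  async writeFile(path: string, content: string): Promise<void> {
    this.files.set(path, content);
  }

  async listFiles(prefix: string): Promise<string[]> {
    return [...this.files.keys()].filter((p) => p.startsWith(prefix)).sort();
  }
}

// A toy sandbox that records the command instead of running it.
class EchoSandbox implements Sandbox {
  async execute(command: string): Promise<CommandResult> {
    return { stdout: `ran: ${command}`, stderr: '', exitCode: 0 };
  }
}

// Agent-side code depends only on the Workspace interface, so the same
// function works against a remote container or a local directory.
async function writeAndCheck(
  workspace: Workspace,
  path: string,
  source: string,
): Promise<CommandResult> {
  await workspace.filesystem.writeFile(path, source);
  return await workspace.sandbox.execute(`tsc --noEmit ${path}`);
}
```

Swapping `InMemoryFilesystem` and `EchoSandbox` for a Daytona-backed or local implementation would not require any change to `writeAndCheck`, which is the point of the abstraction.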
@ -110,7 +118,7 @@ sequenceDiagram
D-->>n8n: Sandbox ID
n8n->>S: Write node-types catalog via filesystem API
n8n->>n8n: Wrap sandbox as Mastra Workspace
n8n->>n8n: Wrap sandbox as native Workspace
n8n->>n8n: Inject Workspace into builder agent
Note over S: Agent works inside sandbox
@ -128,7 +136,10 @@ The process starts with a **pre-warmed image**. On first use, n8n builds a Dayto
One thing that cannot be baked into the image is the **node-types catalog** (a searchable index of all available n8n nodes). It is too large for the image build API, so it is written to each sandbox after creation via the filesystem API.
Once the sandbox is provisioned and the catalog is written, n8n wraps it in a Mastra Workspace and hands it to the builder agent. From that point, the agent works autonomously inside the sandbox — writing files, running the compiler, fixing errors, iterating — until it produces a valid workflow.
Once the sandbox is provisioned and the catalog is written, n8n wraps it in a
native Workspace and hands it to the builder agent. From that point, the agent
works autonomously inside the sandbox — writing files, running the compiler,
fixing errors, iterating — until it produces a valid workflow.
### What is inside a Daytona sandbox
@ -143,7 +154,9 @@ Once the sandbox is provisioned and the catalog is written, n8n wraps it in a Ma
## n8n Sandbox Service: API-Backed Alternative
The n8n sandbox service exposes a simple HTTP API for creating sandboxes, executing shell commands, and manipulating files. Instance AI uses it through a custom Mastra sandbox and filesystem adapter.
The n8n sandbox service exposes a simple HTTP API for creating sandboxes,
executing shell commands, and manipulating files. Instance AI uses it through
native sandbox and filesystem adapters.
Builder prewarming follows Daytona-like lazy image instantiation semantics:
- the builder creates an in-memory image placeholder from setup commands

@ -482,9 +482,9 @@ replaying all SSE events.
### How It Works
1. **Mastra V2 messages** — Mastra persists tool invocations, reasoning, and
text in its V2 message format. The backend parses these into rich
`InstanceAiMessage[]` objects with tool calls and flat agent trees.
1. **Native agent messages** — native memory persists tool invocations,
reasoning, and text as `AgentDbMessage` records. The backend parses these
into rich `InstanceAiMessage[]` objects with tool calls and flat agent trees.
2. **Agent tree snapshots** — after each `run-finish`, the backend replays
events through `buildAgentTreeFromEvents()` and stores the resulting tree

@ -707,11 +707,11 @@ everything; sub-agents receive only what they need.
1. Create a file in `src/tools/<domain>/` following the naming convention `<verb>-<noun>.tool.ts`
2. Define input/output schemas with Zod (`.describe()` on fields — these are the LLM's parameter docs)
3. Export a factory function that takes the service context and returns a Mastra tool
3. Export a factory function that takes the service context and returns a native `Tool`
4. Register the tool in `src/tools/index.ts` (in `createAllTools` or `createOrchestrationTools`)
5. If the tool requires a new service method, add it to the interface in `src/types.ts`
and implement it in the backend adapter
6. New native domain tools are automatically available for delegation — the
orchestrator can include them in sub-agent tool subsets via `delegate`
7. For HITL tools, define `suspendSchema` and `resumeSchema` — Mastra handles
the suspension/resume lifecycle automatically
7. For HITL tools, define native suspend/resume schemas so the agent runtime
handles the suspension/resume lifecycle automatically
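
Steps 2 and 3 can be sketched as follows. This is an illustrative outline under stated assumptions, not the real API: the `Tool` and `Schema` interfaces, `WorkflowService`, `createGetWorkflowTool`, and `callTool` are hypothetical, and a tiny hand-written `Schema` stands in for Zod so the example has no dependencies. In the actual codebase the schemas would be Zod objects with `.describe()` on each field.

```typescript
// All names below are hypothetical; the real `Tool` type in @n8n/agents may differ.
// `Schema` is a dependency-free stand-in for a Zod schema.
interface Schema<T> {
  description: string;
  parse(value: unknown): T;
}

interface Tool<I, O> {
  name: string;
  description: string;
  inputSchema: Schema<I>;
  outputSchema: Schema<O>;
  handler(input: I): Promise<O>;
}

// The slice of the service context this tool needs.
interface WorkflowService {
  getWorkflowName(id: string): Promise<string | undefined>;
}

type GetWorkflowInput = { workflowId: string };
type GetWorkflowOutput = { found: boolean; name?: string };

const inputSchema: Schema<GetWorkflowInput> = {
  description: 'workflowId: ID of the workflow to look up',
  parse(value) {
    if (typeof value !== 'object' || value === null) throw new Error('input must be an object');
    const { workflowId } = value as { workflowId?: unknown };
    if (typeof workflowId !== 'string' || workflowId === '') {
      throw new Error('workflowId must be a non-empty string');
    }
    return { workflowId };
  },
};

const outputSchema: Schema<GetWorkflowOutput> = {
  description: 'found: whether the workflow exists; name: its name when found',
  parse(value) {
    if (typeof value !== 'object' || value === null) throw new Error('output must be an object');
    const { found, name } = value as { found?: unknown; name?: unknown };
    if (typeof found !== 'boolean') throw new Error('found must be a boolean');
    if (typeof name === 'string') return { found, name };
    if (name === undefined) return { found };
    throw new Error('name must be a string when present');
  },
};

// Factory: takes the service context and returns the tool.
function createGetWorkflowTool(
  service: WorkflowService,
): Tool<GetWorkflowInput, GetWorkflowOutput> {
  return {
    name: 'get-workflow',
    description: 'Look up a workflow by ID',
    inputSchema,
    outputSchema,
    async handler(input) {
      const name = await service.getWorkflowName(input.workflowId);
      return name === undefined ? { found: false } : { found: true, name };
    },
  };
}

// What a runtime would roughly do: validate input, run, validate output.
async function callTool<I, O>(tool: Tool<I, O>, rawInput: unknown): Promise<O> {
  const input = tool.inputSchema.parse(rawInput);
  return tool.outputSchema.parse(await tool.handler(input));
}
```

Because the factory receives the service as an argument, the tool can be exercised in a unit test with a fake `WorkflowService` and no agent run.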