Documentation — Plexus

Overview

Plexus is a native macOS application that orchestrates autonomous coding agents in parallel. You describe the work in a Simplex spec, Plexus decomposes it into isolated tasks, assigns agents, and merges the results. Your working directory is never modified until you approve.

Spec-driven

Write structured specs with landmarks, constraints, and DONE_WHEN criteria. The linter catches ambiguity before orchestration begins.

Parallel isolation

Every agent runs in its own git worktree branched from HEAD, with no shared state, no race conditions, and no conflicts during execution.

Mix agents

Claude Code, Codex, or Cline. Use one for planning and another for execution, swap models per task, and escalate on failure.

Spec Intelligence

Track spec evolution across runs. Timeline, diffs, and function-to-file footprints in a dedicated git repo inside .plex/specs/.

Installation

Download the DMG from the Plexus homepage and drag the app to Applications. Plexus requires macOS 14 (Sonoma) or later and ships as a universal binary that runs natively on Apple Silicon and Intel.

Or install via the command line:

Shell
curl -fsSL https://thinkwright.ai/plexus/install | sh

On first launch, Plexus will prompt you to select an agent provider and configure your API key. Keys are stored in the macOS Keychain and never written to disk.

No telemetry, no tracking, no accounts. Plexus runs entirely on your machine. API calls go directly to your chosen provider.

Core Workflow

1. Write a Spec

Create a Simplex specification with landmarks, constraints, and DONE_WHEN criteria. The built-in linter validates structure before you proceed.

2. Orchestrate

Plexus decomposes the spec into parallel tasks, creates isolated git worktrees, and launches agents. Monitor progress in real time via the coordinator chat.

3. Review

Each agent's diff is presented independently. Accept, reject, or re-run individual tasks with a different model. Failed tasks support model escalation.

4. Merge

Passing agents merge into the orchestration branch automatically. Review the combined diff, then merge to main in one action. Failed agents can be re-run. Your checkout stays clean until you approve.

Simplex Specs

Plexus works from Simplex specifications, which you can write yourself or have Plexus generate. Simplex is a structured format designed for consumption by AI agents: every function maps to an agent task with deterministic boundaries, constraints, and success criteria.

Why Simplex instead of free-form prompts:

Deterministic decomposition: each function becomes exactly one agent task with clear file boundaries
Lintable: structural errors, circular dependencies, and ambiguous constraints are caught before orchestration
Reproducible: the same spec produces the same task graph every time
Trackable: Spec Intelligence records how specs evolve across orchestration runs

Spec Anatomy

A Simplex spec has three levels: the project header, landmarks (groups of related work), and functions (individual agent tasks).

Simplex
PROJECT auth-system
GOAL    Add JWT authentication to the API server

LANDMARK token-service "JWT token generation and validation"

  FUNCTION generate-tokens
    CONSTRAINTS
      MODIFY server/auth/jwt.go
      MODIFY server/auth/jwt_test.go
    DONE_WHEN
      "go test ./server/auth/... passes"
      "tokens include exp, sub, and iss claims"

  FUNCTION middleware
    DEPENDS_ON generate-tokens
    CONSTRAINTS
      MODIFY server/middleware/auth.go
    DONE_WHEN
      "401 returned for missing/expired tokens"
      "valid tokens pass through to handler"

LANDMARK integration "Wire auth into existing routes"

  FUNCTION protect-routes
    DEPENDS_ON middleware
    CONSTRAINTS
      MODIFY server/routes.go
    DONE_WHEN
      "all /api/* routes require auth"
      "/health and /login remain public"

Key Concepts

FUNCTION: maps 1:1 to an agent task. Each function has its own worktree and runs in isolation.
CONSTRAINTS: MODIFY, CREATE, READ directives tell the agent exactly which files to touch. Agents are scoped to these boundaries.
DONE_WHEN: concrete, testable success criteria. The agent evaluates these before reporting completion.
DEPENDS_ON: declares ordering between functions. Independent functions run in parallel; dependent ones wait for predecessors.
LANDMARK: groups related functions for organization. Has no effect on execution order.

Linting & Validation

Before orchestration begins, Plexus runs the Simplex linter across your spec. The linter catches:

Structural errors: missing DONE_WHEN, empty functions, malformed headers
Circular dependencies: A DEPENDS_ON B DEPENDS_ON A
Dangling references: DEPENDS_ON pointing to a function that doesn't exist
Constraint conflicts: multiple functions modifying the same file without declaring dependencies
Ambiguous criteria: DONE_WHEN clauses that are vague or untestable

The linter reports warnings and errors with line numbers. Errors block orchestration; warnings are advisory. You can also lint specs on the web at thinkwright.ai/simplex.

Constraint overlap without dependencies is the most common cause of merge conflicts. If two functions both modify routes.go, they must declare a dependency relationship or the linter will flag it.
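
The dependency and overlap checks described above can be sketched in a few lines of Python. This is an illustration of the idea, not Plexus's linter: the Function dataclass, the lint function, and the wording of the findings are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    """Hypothetical in-memory form of one Simplex FUNCTION block."""
    name: str
    modifies: set[str] = field(default_factory=set)
    depends_on: set[str] = field(default_factory=set)
    done_when: list[str] = field(default_factory=list)


def _reachable(start: str, by_name: dict[str, Function]) -> set[str]:
    """Every function reachable from `start` by following DEPENDS_ON edges."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        current = stack.pop()
        for dep in by_name[current].depends_on:
            if dep in by_name and dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def lint(functions: list[Function]) -> list[str]:
    """Return findings for missing criteria, dangling refs, cycles, and overlaps."""
    by_name = {f.name: f for f in functions}
    findings: list[str] = []
    for f in functions:
        if not f.done_when:
            findings.append(f"error: {f.name} has no DONE_WHEN criteria")
        for dep in sorted(f.depends_on - set(by_name)):
            findings.append(f"error: {f.name} depends on unknown function {dep}")
    closure = {f.name: _reachable(f.name, by_name) for f in functions}
    for f in functions:
        if f.name in closure[f.name]:
            findings.append(f"error: {f.name} is part of a dependency cycle")
    # Two functions may share a file only if one (transitively) depends on the other.
    for i, a in enumerate(functions):
        for b in functions[i + 1:]:
            shared = a.modifies & b.modifies
            ordered = b.name in closure[a.name] or a.name in closure[b.name]
            if shared and not ordered:
                files = ", ".join(sorted(shared))
                findings.append(f"warning: {a.name} and {b.name} both modify {files}")
    return findings
```

In this sketch, two functions that both list routes.go produce a warning unless one reaches the other through DEPENDS_ON, which mirrors the rule stated above.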

How It Works

When you start orchestration, Plexus creates a dedicated orchestration branch from your current HEAD. Each function in the spec becomes an agent task that runs in its own git worktree branched from that point.

Agents run in real tmux sessions with full terminal access, so they can run tests, build projects, inspect files, and use any tool available in your shell. Optional nono sandboxing provides kernel-level isolation for Claude Code and Codex agents.

Execution Model

Independent functions launch simultaneously, up to the configured concurrency limit
Dependent functions wait for their predecessors to complete before starting
Failed dependencies do not block siblings; only direct dependents are held
Configurable concurrency: default is 3 for Claude Code, unlimited for other agents
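
The rules above can be read as a small scheduling step that runs whenever an agent finishes. The sketch below is one possible reading, not Plexus's scheduler: the status strings and both function names are hypothetical.

```python
from typing import Optional


def classify_pending(
    deps: dict[str, set[str]], status: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Split not-yet-started tasks into (ready, held).

    `status` maps a task to "pending", "running", "passed", or "failed";
    tasks missing from it count as "pending". Tasks whose dependencies are
    still running or pending are neither ready nor held.
    """
    ready, held = [], []
    for task in sorted(deps):
        if status.get(task, "pending") != "pending":
            continue
        dep_states = [status.get(dep, "pending") for dep in deps[task]]
        if "failed" in dep_states:
            held.append(task)    # a direct dependency failed
        elif all(state == "passed" for state in dep_states):
            ready.append(task)   # every dependency finished successfully
    return ready, held


def next_launches(ready: list[str], running: int, limit: Optional[int]) -> list[str]:
    """Pick the tasks to start now; limit=None means unlimited concurrency."""
    if limit is None:
        return list(ready)
    return ready[: max(0, limit - running)]
```

With a limit of 3 and one agent already running, next_launches starts at most two more tasks. A failed task holds only the tasks that list it as a dependency, so its siblings still become ready.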

Your checkout is never modified. All agent work happens in isolated worktrees. Changes only appear in your working directory after you explicitly merge them.

Task Decomposition

Each Simplex function becomes exactly one agent task. You don't need to manually wire up dependencies between functions. Plexus infers them automatically from the READS and WRITES declarations in your spec.

Smart Dependency Inference

When a function declares that it READS a resource that another function WRITES, Plexus establishes a dependency. The matching uses four strategies, applied in order:

Exact match: normalized resource names match directly (e.g. READS: auth tokens → WRITES: auth tokens)
Token overlap: significant words in the resource names overlap (e.g. READS: scoring module output → WRITES: scoring module)
Function name reference: a READS declaration mentions another function by name (e.g. READS: project skeleton from bootstrap depends on the bootstrap function)
Keyword lists: comma-separated lists that follow a colon are expanded and matched (e.g. READS: all modules: types, scoring, recall creates dependencies on any function whose name or WRITES contains those keywords)
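
The four strategies can be approximated as a single function that tries each in turn. This is a rough sketch under stated assumptions, not the matcher Plexus ships: the stopword list, the rule that every significant word of the WRITES name must appear in the READS name, and the function names are all hypothetical choices.

```python
import re
from typing import Optional

# Assumed list of words too common to count as "significant".
STOPWORDS = {"the", "a", "an", "of", "from", "all", "and"}


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation so 'Auth-Tokens' equals 'auth tokens'."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def tokens(text: str) -> set[str]:
    return {word for word in normalize(text).split() if word not in STOPWORDS}


def match_strategy(reads: str, writer_name: str, writes: str) -> Optional[str]:
    """Return the first strategy that links a READS entry to a writer, else None."""
    # 1. Exact match on normalized resource names.
    if normalize(reads) == normalize(writes):
        return "exact"
    # 2. Token overlap (assumed rule: all significant WRITES words occur in READS).
    written = tokens(writes)
    if written and written <= tokens(reads):
        return "token-overlap"
    # 3. The READS entry mentions the writing function by name.
    if f" {normalize(writer_name)} " in f" {normalize(reads)} ":
        return "function-name"
    # 4. Keyword list: expand the comma-separated part after the colon.
    if ":" in reads:
        keywords = [normalize(k) for k in reads.split(":", 1)[1].split(",")]
        haystack = normalize(writer_name) + " " + normalize(writes)
        if any(k and k in haystack for k in keywords):
            return "keyword-list"
    return None
```

Calling match_strategy("scoring module output", "scoring", "scoring module") returns "token-overlap" here, because the exact comparison fails first and the overlap rule then succeeds.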

The inferred dependencies form a directed acyclic graph, which is topologically sorted into execution waves:

Example: Wave Scheduling
# Wave 1 (parallel) ─ no dependencies
  generate-tokens    → Agent 1
  setup-database     → Agent 2
  add-config-schema  → Agent 3

# Wave 2 (waits for wave 1 to merge)
  middleware         → Agent 1  # READS token output
  seed-data          → Agent 2  # READS database schema

# Wave 3
  protect-routes     → Agent 1  # READS middleware
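
The wave grouping shown above can be reproduced with a short level-by-level topological sort. This is a minimal sketch assuming the dependency graph is already available as a mapping from task name to the set of tasks it depends on; schedule_waves is a hypothetical name, not a Plexus API.

```python
def schedule_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves: a task joins the first wave after all its dependencies."""
    remaining = {task: set(needs) for task, needs in deps.items()}
    done: set[str] = set()
    waves: list[list[str]] = []
    while remaining:
        # A task is ready once everything it depends on is in an earlier wave.
        ready = sorted(task for task, needs in remaining.items() if needs <= done)
        if not ready:
            stuck = ", ".join(sorted(remaining))
            raise ValueError(f"cycle or unknown dependency among: {stuck}")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


if __name__ == "__main__":
    example = {
        "generate-tokens": set(),
        "setup-database": set(),
        "add-config-schema": set(),
        "middleware": {"generate-tokens"},
        "seed-data": {"setup-database"},
        "protect-routes": {"middleware"},
    }
    for number, wave in enumerate(schedule_waves(example), start=1):
        print(f"Wave {number}: {', '.join(wave)}")
```

Tasks inside a wave are sorted alphabetically only to make the output stable; the order within a wave carries no meaning because those tasks run in parallel.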

Deferred Branching

Wave scheduling determines what code each agent can see. Wave 1 agents branch from your current HEAD. After wave 1 completes and merges, wave 2 agents branch from the orchestration branch tip, so they work on top of wave 1's merged output. This continues for each subsequent wave.

A wave 2 agent that depends on database setup will see the actual tables, migrations, and schema that the wave 1 agent created, rather than relying on descriptions in the spec.
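
In git terms, deferred branching comes down to choosing the start point passed to git worktree add. The sketch below only builds the command. The branch naming scheme, the worktree directory, and the function name are assumptions for illustration, since the documentation does not specify where Plexus places worktrees or how it names branches.

```python
def worktree_command(
    task: str,
    wave: int,
    orch_branch: str,
    root: str = ".plex/worktrees",  # hypothetical location
) -> list[str]:
    """Build `git worktree add -b <branch> <path> <start-point>` for one agent task."""
    # Wave 1 branches from the current HEAD; later waves branch from the
    # orchestration branch tip, which already contains earlier merged work.
    start_point = "HEAD" if wave == 1 else orch_branch
    branch = f"plexus/{task}"  # hypothetical branch naming scheme
    return ["git", "worktree", "add", "-b", branch, f"{root}/{task}", start_point]


if __name__ == "__main__":
    print(" ".join(worktree_command("generate-tokens", 1, "plexus/orch")))
    print(" ".join(worktree_command("middleware", 2, "plexus/orch")))
```

Because the start point is resolved when the worktree is created, a wave 2 task created after wave 1 has merged sees the merged files directly in its checkout.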

Agent Isolation

Each agent receives the full spec for context but is instructed to implement only its assigned function. Every agent runs in its own git worktree, a real isolated checkout of the repository. Your working directory is never modified during orchestration. Agents evaluate their own DONE_WHEN criteria before reporting completion.

Merge Strategy

After agents complete their tasks, Plexus merges each worktree back into the dedicated orchestration (orch) branch. Your working branch is never touched until you explicitly accept the results.

Conflict Prevention

Before any merging begins, Plexus takes several steps to prevent conflicts:

READS/WRITES isolation: the dependency graph determines merge order. Agents whose work depends on other agents merge after their dependencies, ensuring each merge builds on prior output
Topological merge order: agents with no dependencies merge first, then agents whose dependencies have all merged, and so on. This mirrors the wave execution order
Pre-flight conflict prediction: after agents complete but before merging, Plexus performs dry-run merges on agent pairs that touched overlapping files and flags predicted conflicts early
Linter enforcement: the spec linter warns about overlapping file constraints between functions that don't declare dependencies, catching potential conflicts before orchestration starts
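
The first half of pre-flight prediction, deciding which agent pairs are worth a dry-run merge, can be sketched as a set intersection. This is an illustrative sketch: overlapping_pairs is a hypothetical helper, and the per-agent file sets are assumed to have been collected already, for instance from git diff --name-only against the branch point. The dry-run merge itself is not shown.

```python
from itertools import combinations


def overlapping_pairs(touched: dict[str, set[str]]) -> list[tuple[str, str, list[str]]]:
    """Return (agent_a, agent_b, shared_files) for every pair with common files."""
    pairs = []
    for a, b in combinations(sorted(touched), 2):
        shared = sorted(touched[a] & touched[b])
        if shared:
            pairs.append((a, b, shared))
    return pairs


if __name__ == "__main__":
    touched = {
        "middleware": {"server/middleware/auth.go", "server/routes.go"},
        "protect-routes": {"server/routes.go"},
        "generate-tokens": {"server/auth/jwt.go"},
    }
    for a, b, files in overlapping_pairs(touched):
        print(f"dry-run merge needed: {a} + {b} ({', '.join(files)})")
```

Only pairs that share at least one file are returned, so agents that touched disjoint files are skipped without any dry-run merge.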

Multi-Stage Merge Pipeline

For each agent, Plexus attempts to merge its work through up to three escalating strategies:

Stage 1: Standard merge. A normal git merge of the agent's branch into the orch worktree. If the agent modified different files from those touched by prior merges, this succeeds cleanly
Stage 2: Patience merge. If the standard merge fails, Plexus aborts it and retries using git's patience diff algorithm, which handles moved code blocks and reformatted sections better than the default algorithm
Stage 3: Rebase + merge. If patience also fails, Plexus rebases the agent's branch onto the current orch tip, replaying its commits on top of all previously merged work. This often resolves conflicts from multiple agents modifying adjacent but not identical lines, because the rebase rewrites the agent's changes against the latest state

If all three stages fail, the agent is marked as merge-failed. You then have several options:

Re-run failed: retry the failed agents from scratch. They branch from the current orch tip, which already contains all successfully merged work, giving the agent a fresh attempt with full context of what's been built
Merge passed to main: accept the work that did merge cleanly and bring it into your branch, leaving conflicted work behind
Dismiss: abandon the entire orchestration. Nothing touches your branch. The orch branch and worktrees are cleaned up
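
The three-stage escalation can be sketched as a function that tries each git command in turn and cleans up after a failure. The exact commands Plexus runs are not documented here, so the ones below are assumptions: in particular, the patience stage is expressed as git merge -X diff-algorithm=patience, and the rebase is assumed to run inside the agent's own worktree. The command runner is injected so the logic can be exercised without a repository.

```python
import subprocess
from typing import Callable, Sequence

# (argv, working directory) -> True when the command exits with status 0.
Runner = Callable[[Sequence[str], str], bool]


def subprocess_runner(argv: Sequence[str], cwd: str) -> bool:
    """Run a real git command; used when no fake runner is supplied."""
    return subprocess.run(list(argv), cwd=cwd).returncode == 0


def merge_with_escalation(
    branch: str,
    orch_branch: str,
    orch_dir: str,
    agent_dir: str,
    run: Runner = subprocess_runner,
) -> str:
    """Return the stage that succeeded, or "merge-failed" if none did."""
    # Stage 1: standard merge into the orch worktree.
    if run(["git", "merge", "--no-edit", branch], orch_dir):
        return "standard"
    run(["git", "merge", "--abort"], orch_dir)

    # Stage 2: retry with the patience diff algorithm.
    if run(["git", "merge", "--no-edit", "-X", "diff-algorithm=patience", branch], orch_dir):
        return "patience"
    run(["git", "merge", "--abort"], orch_dir)

    # Stage 3: rebase the agent branch onto the orch tip, then merge again.
    if run(["git", "rebase", orch_branch], agent_dir):
        if run(["git", "merge", "--no-edit", branch], orch_dir):
            return "rebase"
        run(["git", "merge", "--abort"], orch_dir)
    else:
        run(["git", "rebase", "--abort"], agent_dir)
    return "merge-failed"
```

Each failed attempt is aborted before the next stage starts, so the orch worktree is back in a clean state whether the function returns a stage name or "merge-failed".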

Coordinator Chat

The coordinator is a dedicated chat interface that runs alongside orchestration. It has visibility into all agent activity and lets you monitor, steer, and intervene without leaving Plexus.

What the Coordinator Can Do

Monitor progress: see which agents are running, completed, or failed at a glance
Analyze the spec: the coordinator acts as a Simplex enforcer, validating that agent output matches spec constraints
Steer work: adjust scope, clarify requirements, or add context for agents mid-flight
Intervene on stuck tasks: cancel, restart, or reassign tasks that aren't progressing

Open the coordinator with ⌘K during orchestration. The coordinator maintains its own conversation history, separate from the planning chat.

Spec Intelligence

Spec Intelligence tracks how your specs evolve across orchestration runs. Every time you run an orchestration, Plexus snapshots the spec into a dedicated git repository at .plex/specs/ in your project root.

What Gets Tracked

Spec content: the full spec text is committed as a file, so git diff shows exactly what changed between runs
Orchestration metadata: branch name, function count, and pass/fail outcomes are stored in structured commit messages
Footprint: a footprint.json mapping each function to the files it actually produced or modified

Accessing Spec Intelligence

Use the Spec Intelligence menu in the macOS menu bar:

Timeline: chronological view of all orchestration runs with function counts, pass/fail status, and commit hashes
Diff: side-by-side diff of the spec between the last two orchestration runs, showing what was added, removed, or changed
Open .plex/specs/: opens the git repo directly so you can run git log, git diff, or any git command for deeper analysis

The .plex/specs/ repo is a full git repository. You can use standard git tools to analyze spec evolution (git log --stat, git diff HEAD~3, etc.).

Orchestration Reports

After every orchestration, Plexus generates a PLEXUS_REPORT.md in the orchestration branch. The report includes:

Summary: total tasks, pass/fail counts, elapsed time
Per-task results: status, agent used, model, files modified, and any error output
Merge status: which tasks were merged, skipped, or conflicted
Spec hash: the Spec Intelligence commit hash linking back to the spec snapshot

Reports are viewable in Plexus's file explorer with full markdown rendering, including tables, code blocks, blockquotes, and horizontal rules.

Agent Configuration

The planning model and execution model are configured independently. Use a fast model for plan decomposition and a more capable model for complex implementation tasks.

Claude Code

Anthropic's coding agent. Use a Max subscription (no API key needed) or bring your own Anthropic API key. Supports Sonnet, Opus, and Haiku with extended thinking. Default concurrency: 3.

Codex

OpenAI's coding agent. Requires an OpenAI API key. Works with GPT-4o, o3, and other models. No concurrency limit.

Cline

Open-source agent. Bring any API key (Anthropic, OpenAI, Google, or local models via Ollama). No concurrency limit.

Agent Control

Plexus gives you fine-grained control over individual agent tasks during and after orchestration:

During Orchestration

Live monitoring: watch each agent's terminal output in real time by switching tabs (⌘1-9)
Stop individual agents: cancel a stuck or misbehaving agent without affecting others (⌘.)
Coordinator intervention: use the coordinator chat to steer agents or adjust scope mid-flight

After Orchestration

Re-run conflicted: failed agents can be retried against the current orch branch state
Merge passed to main: all passing work merges to the orchestration branch first, then you merge the combined result to main in one action
Dismiss: abandon the orchestration entirely; your working directory is untouched

Timeout Configuration

Agent timeout defaults to 30 minutes (1800 seconds). This accommodates long-running operations like Rust compilation, cargo test on large projects, or complex refactors. Configure via ~/.config/plex/config.json:

JSON
{
  "claude_timeout": 1800,
  "max_concurrent": 3,
  "max_attempts": 3
}

Set claude_timeout to 0 to disable the timeout entirely. Plexus will still terminate agent processes cleanly when you cancel orchestration or close the app.

Sandbox (nono)

Plexus integrates with nono, a kernel-level sandbox that constrains what agent processes can access on your system. When enabled, each agent runs inside a nono sandbox with OS-enforced boundaries: actual kernel restrictions via Seatbelt (macOS) and Landlock (Linux), not advisory limits.

Supported Agents

Sandbox support is currently available for Claude Code and Codex, which have built-in nono profiles. Cline and Gemini agents are not sandboxed at this time.

What Gets Restricted

Filesystem: agents can only read and write within their assigned worktree and the repo's .git directory. Access to your home directory, other projects, or system files is denied.
Network: allowed by the agent profile (needed for API calls and package downloads).
Process scope: nono wraps the agent command at launch. It does not persist as a parent process: it applies the sandbox policy and then execs into the agent.

Setup

Install nono via Homebrew:

Shell
brew install always-further/tap/nono

Once installed, Plexus automatically detects nono. A lock icon appears in the sidebar for workspaces using Claude or Codex. Click it to toggle sandboxing per workspace. The setting persists across sessions.

How It Works

When sandbox is enabled, Plexus wraps each agent command with:

Shell
nono run --silent --profile claude-code \
  --allow <worktree> --allow <repo>/.git \
  --workdir <worktree> \
  -- claude --print --system-prompt "..." ...

The --allow flags grant access to the agent's isolated worktree and the repository's git metadata (required for commits in worktrees). Everything else is denied by default.
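
For readers scripting something similar, the wrapper shown above can be assembled as an argument list. This sketch uses only the flags that appear in the command above; the function name is hypothetical, and the profile name for agents other than Claude Code is left as a parameter because it is not documented here.

```python
def sandboxed_command(
    agent_argv: list[str],
    worktree: str,
    repo: str,
    profile: str = "claude-code",
) -> list[str]:
    """Prefix an agent command with the nono invocation shown above."""
    return [
        "nono", "run", "--silent",
        "--profile", profile,
        "--allow", worktree,          # the agent's isolated checkout
        "--allow", f"{repo}/.git",    # git metadata needed for commits
        "--workdir", worktree,
        "--", *agent_argv,
    ]


if __name__ == "__main__":
    argv = sandboxed_command(["claude", "--print"], "/tmp/wt/middleware", "/Users/me/api")
    print(" ".join(argv))
```

Passing the command as a list rather than a single string avoids shell quoting issues when a worktree path or prompt contains spaces.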

File Explorer

The sidebar file explorer shows your project tree with language-specific icons and real-time git status indicators:

Language icons: Go, Swift, Rust, TypeScript, JavaScript, Python, Ruby, C/C++, and more are each shown with a distinct icon
Git status badges: M (modified), U (untracked), A (added), D (deleted) appear alongside each file
Markdown preview: click any .md file to render it inline with full table, code block, blockquote, and rule support
Auto-collapse: the file explorer automatically collapses when orchestration starts to maximize space for the activity panes

Keyboard Shortcuts

Shortcut	Action
`⌘N`	New plan
`⌘⏎`	Start orchestration
`⌘1-9`	Switch to agent tab
`⌘[` / `⌘]`	Previous / next tab
`⌘D`	Toggle diff view
`⌘K`	Open coordinator chat
`⌘.`	Stop current agent
`⌘⇧M`	Merge selected task
`⌘E`	Toggle file explorer
`⌘,`	Settings

Configuration

Plexus stores configuration at ~/.config/plex/config.json. All fields are optional, and sensible defaults are used when not specified.

Field	Default	Description
`default_agent`	`"claude"`	Agent provider for new tasks: `claude`, `codex`, or `cline`
`work_dir`	(cwd)	Default project directory on launch
`max_attempts`	`3`	Maximum retry attempts per failed task
`max_concurrent`	`0`	Concurrent agent limit. `0` = auto (3 for Claude, unlimited for others)
`claude_timeout`	`1800`	Per-agent timeout in seconds. `0` = no timeout

Example config.json
{
  "default_agent": "claude",
  "max_attempts": 3,
  "max_concurrent": 4,
  "claude_timeout": 3600
}
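
The defaults in the table above can be modeled as a dictionary that the user's file overrides. This is a sketch of how the documented semantics could be interpreted, not Plexus's loader: the function names are hypothetical, and work_dir is omitted because its default is the current directory.

```python
import json
from pathlib import Path
from typing import Optional

CONFIG_PATH = Path.home() / ".config" / "plex" / "config.json"

# Defaults taken from the table above.
DEFAULTS = {
    "default_agent": "claude",
    "max_attempts": 3,
    "max_concurrent": 0,
    "claude_timeout": 1800,
}


def load_config(path=CONFIG_PATH) -> dict:
    """Merge the user's config over the defaults; a missing file means all defaults."""
    try:
        user = json.loads(Path(path).read_text())
    except FileNotFoundError:
        user = {}
    return {**DEFAULTS, **user}


def effective_concurrency(cfg: dict, agent: Optional[str] = None) -> Optional[int]:
    """Resolve max_concurrent; 0 means auto (3 for Claude, unlimited otherwise)."""
    agent = agent or cfg["default_agent"]
    if cfg["max_concurrent"] > 0:
        return cfg["max_concurrent"]
    return 3 if agent == "claude" else None  # None stands for "no limit"


def effective_timeout(cfg: dict) -> Optional[int]:
    """Resolve claude_timeout in seconds; 0 means no timeout, returned as None."""
    return cfg["claude_timeout"] or None
```

With the example config.json above, effective_concurrency returns 4 and effective_timeout returns 3600; with no file at all, they return 3 and 1800.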