## 1. Project Scaffold

- [ ] 1.1 Initialize package with `package.json`, `tsconfig.json`, module structure (`src/`, `src/cli/`, `src/engine/`, `src/store/`, `src/format/`)
- [ ] 1.2 Add core dependencies: `zod`, `js-yaml`, `nanoid`, `ulid`
- [ ] 1.3 Configure build (tsc or bun build), lint, format, and test scripts
- [ ] 1.4 Create public exports index (`src/index.ts`) with all type and function exports

## 2. Schema Layer — Workflow and Node Types

- [ ] 2.1 Implement `dag-node.ts`: Zod schema for all 7 node types with mutual-exclusivity superRefine, type guards, and AI-field warnings
- [ ] 2.2 Implement `workflow.ts`: WorkflowDefinition schema extending WorkflowBase with nodes array, WorkflowExecutionResult, WorkflowSource types
- [ ] 2.3 Implement `loop.ts`: LoopNodeConfig (`prompt`, `until`, `max_iterations`, `fresh_context`, `interactive`, `gate_message`, `until_bash`)
- [ ] 2.4 Implement `retry.ts`: Retry config (`max_attempts`, `delay_ms`, `on_error`)
- [ ] 2.5 Implement `workflow-run.ts`: WorkflowRun, WorkflowRunStatus, NodeState, NodeOutput, ApprovalContext schemas
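
The mutual-exclusivity rule from 2.1 can be sketched without any dependency as shown below; in the real schema the same check would presumably live inside a Zod `superRefine` callback. The seven field names and the helper names are assumptions made for illustration, since this plan does not list them.

```typescript
// Hypothetical discriminating fields, one per node type. The real names are
// whatever dag-node.ts ends up defining.
const NODE_TYPE_FIELDS = [
  "prompt", "command", "bash", "script", "loop", "approval", "cancel",
] as const;

type NodeTypeField = (typeof NODE_TYPE_FIELDS)[number];

interface RawDagNode {
  id: string;
  [key: string]: unknown;
}

// Lists the node type fields that are actually set on the node.
function presentTypeFields(node: RawDagNode): NodeTypeField[] {
  return NODE_TYPE_FIELDS.filter((field) => node[field] !== undefined);
}

// Mutual exclusivity: exactly one type field must be present.
// Returns an error message, or null when the node passes the check.
function checkMutualExclusivity(node: RawDagNode): string | null {
  const present = presentTypeFields(node);
  if (present.length === 0) {
    return `node "${node.id}": no node type field set`;
  }
  if (present.length > 1) {
    return `node "${node.id}": conflicting node type fields: ${present.join(", ")}`;
  }
  return null;
}

// A type guard in the style task 2.1 asks for (assumed shape of a bash node).
function isBashNode(node: RawDagNode): node is RawDagNode & { bash: string } {
  return typeof node["bash"] === "string";
}
```

Returning a message instead of throwing keeps the check easy to adapt to `ctx.addIssue(...)` inside a refinement.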

## 3. YAML Format — Loader and Validation

- [ ] 3.1 Implement `loader.ts`: YAML parsing via js-yaml, per-node `dagNodeSchema` validation, DAG structure validation (unique IDs, `depends_on` refs, cycle detection via Kahn's algorithm)
- [ ] 3.2 Implement `command-validation.ts`: Command name format validation
- [ ] 3.3 Implement `model-validation.ts`: Provider/model resolution (optional; can be deferred until AI provider integration)
- [ ] 3.4 Add workflow-level validation: required fields, provider identity, node ref integrity
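
The three structural checks named in 3.1 (unique IDs, `depends_on` references, cycle detection) could look roughly like the sketch below. It operates on already-parsed nodes, so YAML parsing and per-node schema validation are out of scope; `DagNodeRef` and `validateDagStructure` are illustrative names, not part of the planned API.

```typescript
interface DagNodeRef {
  id: string;
  depends_on?: string[];
}

// Returns a list of human-readable problems; an empty list means the DAG is sound.
function validateDagStructure(nodes: DagNodeRef[]): string[] {
  const errors: string[] = [];

  // 1. Node IDs must be unique.
  const ids = new Set<string>();
  for (const node of nodes) {
    if (ids.has(node.id)) errors.push(`duplicate node id: ${node.id}`);
    ids.add(node.id);
  }

  // 2. Every depends_on entry must point at a known node.
  for (const node of nodes) {
    for (const dep of node.depends_on ?? []) {
      if (!ids.has(dep)) errors.push(`node ${node.id} depends on unknown node: ${dep}`);
    }
  }

  // Cycle detection is only meaningful once IDs and references are sound.
  if (errors.length > 0) return errors;

  // 3. Kahn's algorithm: repeatedly remove nodes with no unmet dependencies.
  const inDegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const node of nodes) {
    inDegree.set(node.id, (node.depends_on ?? []).length);
    for (const dep of node.depends_on ?? []) {
      dependents.set(dep, [...(dependents.get(dep) ?? []), node.id]);
    }
  }
  const queue = nodes.filter((n) => inDegree.get(n.id) === 0).map((n) => n.id);
  let visited = 0;
  while (queue.length > 0) {
    const id = queue.shift() as string;
    visited++;
    for (const next of dependents.get(id) ?? []) {
      const remaining = (inDegree.get(next) ?? 0) - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }

  // Any node never reached sits on a cycle or downstream of one.
  if (visited < nodes.length) {
    const stuck = nodes.filter((n) => (inDegree.get(n.id) ?? 0) > 0).map((n) => n.id);
    errors.push(`cycle detected; unresolved nodes: ${stuck.join(", ")}`);
  }
  return errors;
}
```

Collecting all errors rather than failing on the first one fits the resilient-loading goal in 12.5, where a broken file should be reported without stopping discovery.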

## 4. DAG Engine — Core Execution

- [ ] 4.1 Implement `deps.ts`: WorkflowDeps injection interface, IWorkflowPlatform, WorkflowConfig types
- [ ] 4.2 Implement `dag-executor.ts`: Kahn's algorithm topological layering, `buildTopologicalLayers()`, `checkTriggerRule()` (4 trigger rules), Promise.allSettled concurrent layer execution
- [ ] 4.3 Implement node dispatch: execution handlers for PromptNode (AI), CommandNode (command loading), BashNode (subprocess), CancelNode (termination)
- [ ] 4.4 Implement `executor-shared.ts`: `substituteWorkflowVariables()`, `loadCommandPrompt()`, `classifyError()`, `safeSendMessage()`
- [ ] 4.5 Implement `output-ref.ts`: `$nodeId.output` and `$nodeId.output.field` resolution with strict field access
- [ ] 4.6 Implement `condition-evaluator.ts`: `when:` expression parser supporting comparison operators (`==`, `!=`, `<`, `>`, `<=`, `>=`), compound `AND`/`OR`, and `$nodeId.output` references as operands
- [ ] 4.7 Implement `event-emitter.ts`: Typed events (workflow_started/completed/failed, node_started/completed/failed/skipped)
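
As a rough illustration of 4.2, the sketch below groups nodes into layers whose members have no unmet dependencies, then runs each layer concurrently with `Promise.allSettled`. Trigger rules and `when:` conditions are deliberately left out, and the `runNode` callback and `LayerNode` type are assumptions rather than the planned signatures.

```typescript
interface LayerNode {
  id: string;
  depends_on?: string[];
}

// Kahn-style layering: each pass collects every node whose dependencies
// have all been placed in earlier layers.
function buildTopologicalLayers(nodes: LayerNode[]): string[][] {
  const pending = new Map<string, Set<string>>();
  for (const node of nodes) {
    pending.set(node.id, new Set(node.depends_on ?? []));
  }

  const layers: string[][] = [];
  while (pending.size > 0) {
    const ready: string[] = [];
    pending.forEach((deps, id) => {
      if (deps.size === 0) ready.push(id);
    });
    if (ready.length === 0) {
      throw new Error(
        "cycle or unknown dependency among: " + Array.from(pending.keys()).join(", "),
      );
    }
    for (const id of ready) pending.delete(id);
    pending.forEach((deps) => {
      for (const id of ready) deps.delete(id);
    });
    layers.push(ready);
  }
  return layers;
}

// Runs layers in order; nodes within a layer run concurrently, and one
// rejected node does not prevent its siblings from settling.
async function runLayers(
  layers: string[][],
  runNode: (id: string) => Promise<string>,
): Promise<Map<string, PromiseSettledResult<string>>> {
  const results = new Map<string, PromiseSettledResult<string>>();
  for (const layer of layers) {
    const settled = await Promise.allSettled(layer.map((id) => runNode(id)));
    layer.forEach((id, index) => {
      const result = settled[index];
      if (result !== undefined) results.set(id, result);
    });
  }
  return results;
}
```

A real executor would consult `checkTriggerRule()` before dispatching each node, using the settled results of its upstream nodes.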

## 5. Event Sourcing — Persistence and Replay

- [ ] 5.1 Implement `store.ts`: IWorkflowStore interface (createWorkflowRun, getWorkflowRun, updateWorkflowRun, failWorkflowRun, createWorkflowEvent, getCompletedDagNodeOutputs, getActiveWorkflowRunByPath)
- [ ] 5.2 Implement `executor.ts`: Top-level workflow orchestrator — create run, path-lock guard, dispatch to dag-executor, handle resume with prior completed nodes, event emission
- [ ] 5.3 Implement event persistence: 8 event types stored chronologically, node outputs stored for resume
- [ ] 5.4 Implement resume: `hydrateResumableRun()` loads prior completed node outputs, skips re-execution
- [ ] 5.5 Implement cleanup: retention-based run record and artifact removal
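
The resume behaviour in 5.4 amounts to replaying stored events, keeping the outputs of nodes that completed, and re-running everything else. The sketch below shows that idea over an in-memory event list; the event shape and both function names are hypothetical and are not the `hydrateResumableRun()` or `getCompletedDagNodeOutputs` signatures.

```typescript
// Assumed minimal event shape; the persisted schema may carry more fields.
interface StoredRunEvent {
  type: string;
  nodeId?: string;
  output?: string;
}

// Replays events in order and keeps the latest output of each completed node.
function collectCompletedOutputs(events: StoredRunEvent[]): Map<string, string> {
  const outputs = new Map<string, string>();
  for (const entry of events) {
    if (entry.type === "node_completed" && entry.nodeId !== undefined) {
      outputs.set(entry.nodeId, entry.output ?? "");
    }
  }
  return outputs;
}

// Splits the workflow's nodes into those whose prior output can be reused
// and those that still need to execute.
function planResume(
  nodeIds: string[],
  completed: Map<string, string>,
): { reuse: string[]; execute: string[] } {
  return {
    reuse: nodeIds.filter((id) => completed.has(id)),
    execute: nodeIds.filter((id) => !completed.has(id)),
  };
}
```

Nodes that started or failed without completing fall into `execute`, so a resumed run retries them.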

## 6. Storage Backends

- [ ] 6.1 Implement filesystem store: `createFsStore(path)` — run.json per run, events.jsonl, node outputs as JSON files, file-level locking
- [ ] 6.2 Implement SQLite store: `createSqliteStore(path)` — workflow_runs, workflow_events, node_outputs tables with WAL mode
- [ ] 6.3 Implement Postgres store: `createPostgresStore(connectionString)` — same schema as SQLite, pg driver

## 7. Variable Substitution

- [ ] 7.1 Implement workflow-level variable substitution: `$WORKFLOW_ID`, `$ARGUMENTS`, `$ARTIFACTS_DIR`, `$BASE_BRANCH`, `$DOCS_DIR`
- [ ] 7.2 Implement node output references in prompts: `$nodeId.output` (full text), `$nodeId.output.field` (structured field access)
- [ ] 7.3 Implement loop-specific variables: `$LOOP_USER_INPUT`, `$LOOP_PREV_OUTPUT`, `$REJECTION_REASON`
- [ ] 7.4 Implement command-level variable substitution: `$1` through `$9` positional args
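
A single-pass substitution covering the reference forms above might look like the following sketch. Several behaviours here are assumptions: structured outputs are treated as JSON text, a missing node or field throws (one reading of "strict field access"), and unknown `$UPPER_CASE` names or missing positional arguments are left untouched.

```typescript
interface SubstitutionContext {
  variables: Record<string, string>;   // e.g. WORKFLOW_ID, ARGUMENTS
  nodeOutputs: Record<string, string>; // raw output text per node id
  positional: string[];                // values for $1 through $9
}

// One pattern for all forms, so substituted text is never scanned again.
const REFERENCE_PATTERN =
  /\$([A-Za-z_][\w-]*)\.output(?:\.([A-Za-z_]\w*))?|\$([A-Z][A-Z0-9_]*)|\$([1-9])/g;

// Reads one top-level field from a node output that is expected to be JSON.
function readField(nodeId: string, raw: string, field: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error(`output of node "${nodeId}" is not JSON; cannot read field "${field}"`);
  }
  if (typeof parsed !== "object" || parsed === null || !(field in parsed)) {
    throw new Error(`output of node "${nodeId}" has no field "${field}"`);
  }
  const value = (parsed as Record<string, unknown>)[field];
  return typeof value === "string" ? value : JSON.stringify(value);
}

function substituteVariables(template: string, ctx: SubstitutionContext): string {
  return template.replace(
    REFERENCE_PATTERN,
    (match: string, nodeId?: string, field?: string, variable?: string, position?: string): string => {
      if (nodeId !== undefined) {
        const raw = ctx.nodeOutputs[nodeId];
        if (raw === undefined) throw new Error(`unknown node output reference: ${match}`);
        return field === undefined ? raw : readField(nodeId, raw, field);
      }
      if (variable !== undefined) return ctx.variables[variable] ?? match;
      if (position !== undefined) return ctx.positional[Number(position) - 1] ?? match;
      return match;
    },
  );
}
```

Doing everything in one `replace` pass means a node output that happens to contain `$ARGUMENTS` is inserted verbatim rather than expanded a second time.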

## 8. Script and Bash Execution

- [ ] 8.1 Implement BashNode execution: `bash -c` subprocess with timeout, stdout capture, env var injection
- [ ] 8.2 Implement ScriptNode — bun runtime: inline `bun -e`, named scripts from `.archon/scripts/`, deps installation
- [ ] 8.3 Implement ScriptNode — uv runtime: `uv run python -c`, named scripts, uv deps installation
- [ ] 8.4 Implement `script-discovery.ts`: discover scripts by extension (`.ts` → bun, `.py` → uv) from project and home scopes
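
The extension-to-runtime mapping in 8.4 and the inline invocations in 8.2 and 8.3 can be expressed as two small pure functions. This is a sketch only: the function names are made up for illustration, and named-script and dependency-installation handling are not shown.

```typescript
type ScriptRuntime = "bun" | "uv";

// Mapping taken from task 8.4.
const RUNTIME_BY_EXTENSION: Record<string, ScriptRuntime> = {
  ".ts": "bun",
  ".py": "uv",
};

// Picks a runtime from the file extension; undefined means "not a script".
function runtimeForFile(fileName: string): ScriptRuntime | undefined {
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return undefined; // no extension, or a dotfile such as ".ts"
  return RUNTIME_BY_EXTENSION[fileName.slice(dot)];
}

// Builds the argv for an inline script, following the commands named in 8.2/8.3.
function inlineCommand(runtime: ScriptRuntime, code: string): string[] {
  return runtime === "bun"
    ? ["bun", "-e", code]
    : ["uv", "run", "python", "-c", code];
}
```

Returning an argv array rather than a shell string avoids quoting problems when the inline code contains spaces or quotes.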

## 9. Approval Gates and Human-in-the-Loop

- [ ] 9.1 Implement ApprovalNode handler: pause workflow status, send approval message, store approval context
- [ ] 9.2 Implement approve/resume: transition from paused→running, continue DAG execution
- [ ] 9.3 Implement reject handling: reject node with reason, populate `$REJECTION_REASON`, execute `on_reject` prompt if configured
- [ ] 9.4 Implement `capture_response`: store user comment as `$nodeId.output`
- [ ] 9.5 Implement interactive loop support: `loop.interactive=true` pauses between iterations, `gate_message` shown to user
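
The pause, approve, and reject transitions can be modelled as pure state updates, as in the sketch below. The state shape is hypothetical, and one behaviour is an assumption this plan does not settle: a rejection here returns the run to `running` with `REJECTION_REASON` populated so that an `on_reject` prompt could execute next.

```typescript
// Assumed minimal run state; the real WorkflowRun schema will be richer.
interface PausableRun {
  status: "running" | "paused";
  pendingApprovalNodeId?: string;
  variables: Record<string, string>;
  nodeOutputs: Record<string, string>;
}

type ApprovalDecision =
  | { kind: "approve"; comment?: string }
  | { kind: "reject"; reason: string };

// 9.1: pause the run and remember which node is waiting.
function pauseForApproval(run: PausableRun, nodeId: string): PausableRun {
  return { ...run, status: "paused", pendingApprovalNodeId: nodeId };
}

// 9.2 to 9.4: apply the user's decision and move paused -> running.
function applyDecision(
  run: PausableRun,
  decision: ApprovalDecision,
  captureResponse: boolean,
): PausableRun {
  const nodeId = run.pendingApprovalNodeId;
  if (run.status !== "paused" || nodeId === undefined) {
    throw new Error("run is not waiting on an approval");
  }
  const resumed: PausableRun = {
    status: "running",
    variables: { ...run.variables },
    nodeOutputs: { ...run.nodeOutputs },
  };
  if (decision.kind === "reject") {
    // Exposed to later prompts as $REJECTION_REASON.
    resumed.variables["REJECTION_REASON"] = decision.reason;
  } else if (captureResponse && decision.comment !== undefined) {
    // capture_response: the comment becomes $nodeId.output.
    resumed.nodeOutputs[nodeId] = decision.comment;
  }
  return resumed;
}
```

Keeping the transitions pure makes them straightforward to persist through the store and to cover in the CLI approval-flow tests from 13.7.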

## 10. Loop Nodes

- [ ] 10.1 Implement LoopNode execution: iterative AI prompt loop with completion signal detection (`until`)
- [ ] 10.2 Implement `max_iterations` enforcement: fail node when exceeded
- [ ] 10.3 Implement `fresh_context` for loop iterations: new session vs. accumulated context
- [ ] 10.4 Implement `until_bash`: bash exit code 0 as completion signal (alternative to text signal)
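
The control flow behind 10.1 and 10.2 reduces to a bounded loop that stops when the completion signal appears. The sketch below is synchronous for brevity, whereas the real handler would await an AI call. It also assumes that the `until` signal is detected by substring match and that the previous output is passed forward, as `$LOOP_PREV_OUTPUT` suggests.

```typescript
interface LoopSketchConfig {
  until: string;          // completion signal looked for in each output
  max_iterations: number; // hard upper bound on iterations
}

interface LoopSketchResult {
  iterations: number;
  lastOutput: string;
}

// runIteration stands in for one AI prompt round (hypothetical callback).
function runLoop(
  config: LoopSketchConfig,
  runIteration: (iteration: number, prevOutput: string) => string,
): LoopSketchResult {
  let prevOutput = "";
  for (let iteration = 1; iteration <= config.max_iterations; iteration++) {
    prevOutput = runIteration(iteration, prevOutput);
    if (prevOutput.includes(config.until)) {
      return { iterations: iteration, lastOutput: prevOutput };
    }
  }
  // 10.2: exceeding the bound fails the node instead of returning partial work.
  throw new Error(
    `loop did not emit "${config.until}" within ${config.max_iterations} iterations`,
  );
}
```

Supporting `until_bash` would mean replacing the substring test with a check that a bash command exited with code 0.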

## 11. CLI Tool (MVP)

- [ ] 11.1 Implement main CLI entry point with subcommand routing (workflow list, run, status, resume)
- [ ] 11.2 Implement `workflow list`: discover and display all workflows with source info
- [ ] 11.3 Implement `workflow run`: execute workflow by name with arguments, --cwd, --store flags
- [ ] 11.4 Implement `workflow status`: display active and recent runs
- [ ] 11.5 Implement `workflow resume`: resume a failed workflow
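
Subcommand routing for 11.1 can start from a small argv parser like the one sketched below. The positional layout (`workflow run <name> [arguments...]`) and the assumption that `--cwd` and `--store` each take one value are guesses at the intended interface, and `parseWorkflowCommand` is an illustrative name.

```typescript
const WORKFLOW_SUBCOMMANDS = ["list", "run", "status", "resume"] as const;
type WorkflowSubcommand = (typeof WORKFLOW_SUBCOMMANDS)[number];

interface ParsedWorkflowCommand {
  subcommand: WorkflowSubcommand;
  target?: string; // workflow name (run) or run identifier (resume)
  args: string[];  // remaining positional arguments
  cwd?: string;
  store?: string;
}

function isWorkflowSubcommand(value: string | undefined): value is WorkflowSubcommand {
  return value !== undefined && (WORKFLOW_SUBCOMMANDS as readonly string[]).includes(value);
}

// Expects argv without the node/bun binary and script path.
function parseWorkflowCommand(argv: string[]): ParsedWorkflowCommand {
  const [group, sub, ...rest] = argv;
  if (group !== "workflow") {
    throw new Error(`unknown command group: ${group ?? "(none)"}`);
  }
  if (!isWorkflowSubcommand(sub)) {
    throw new Error(`unknown workflow subcommand: ${sub ?? "(none)"}`);
  }

  const parsed: ParsedWorkflowCommand = { subcommand: sub, args: [] };
  const positionals: string[] = [];
  for (let i = 0; i < rest.length; i++) {
    const token = rest[i];
    if (token === undefined) continue;
    if (token === "--cwd" || token === "--store") {
      const value = rest[i + 1];
      if (value === undefined) throw new Error(`${token} requires a value`);
      if (token === "--cwd") parsed.cwd = value;
      else parsed.store = value;
      i++; // skip the consumed value
    } else {
      positionals.push(token);
    }
  }

  const [target, ...args] = positionals;
  if (target !== undefined) parsed.target = target;
  parsed.args = args;
  return parsed;
}
```

Keeping parsing separate from execution lets the unit tests in 13.7 exercise argument handling without touching a store.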

## 12. Workflow Discovery

- [ ] 12.1 Implement `workflow-discovery.ts`: filesystem discovery across bundled→home→project scopes with precedence
- [ ] 12.2 Implement bundled defaults: embedded default workflows (assist, plan, implement)
- [ ] 12.3 Implement home-global scope: user-level workflows directory
- [ ] 12.4 Implement project scope: repo-local `.workflows/` directory
- [ ] 12.5 Implement resilient loading: per-file error handling, one broken YAML doesn't abort discovery
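
Scope precedence and resilient loading can be combined in one merge step, sketched below over in-memory file contents. Two points are assumptions: later scopes (home, then project) override earlier ones with the same workflow name, and the caller supplies a `parseName` callback standing in for YAML parsing plus validation.

```typescript
type WorkflowScope = "bundled" | "home" | "project";

interface WorkflowFile {
  scope: WorkflowScope;
  path: string;
  content: string;
}

interface DiscoveredWorkflow {
  name: string;
  scope: WorkflowScope;
  path: string;
}

interface DiscoveryResult {
  workflows: DiscoveredWorkflow[];
  errors: { path: string; message: string }[];
}

// Assumed precedence: entries later in this list win on name collisions.
const SCOPE_ORDER: WorkflowScope[] = ["bundled", "home", "project"];

function discoverWorkflows(
  files: WorkflowFile[],
  parseName: (content: string) => string,
): DiscoveryResult {
  const byName = new Map<string, DiscoveredWorkflow>();
  const errors: { path: string; message: string }[] = [];

  for (const scope of SCOPE_ORDER) {
    for (const file of files.filter((f) => f.scope === scope)) {
      try {
        const workflowName = parseName(file.content);
        byName.set(workflowName, { name: workflowName, scope, path: file.path });
      } catch (error) {
        // One broken file is recorded and skipped; discovery continues.
        errors.push({
          path: file.path,
          message: error instanceof Error ? error.message : String(error),
        });
      }
    }
  }
  return { workflows: Array.from(byName.values()), errors };
}
```

Returning the errors alongside the workflows lets `workflow list` show what loaded and still warn about files that did not.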

## 13. Testing

- [ ] 13.1 Unit test DAG executor: topological layering, trigger rules, when conditions, node output refs
- [ ] 13.2 Unit test schema validation: all node types, mutual exclusivity, field validation
- [ ] 13.3 Unit test variable substitution: `$nodeId.output`, `$ARGUMENTS`, `$LOOP_PREV_OUTPUT` edge cases
- [ ] 13.4 Unit test condition evaluator: comparison operators, compound AND/OR, error cases
- [ ] 13.5 Unit test filesystem store: create/read/update runs, events, node outputs, resume data
- [ ] 13.6 Unit test SQLite store: same coverage as filesystem
- [ ] 13.7 Unit test CLI commands: argument parsing, output formatting, approval flow
- [ ] 13.8 Integration test: end-to-end workflow execution with bash and script nodes
- [ ] 13.9 Integration test: resume after failure with prior node outputs loaded