Engineer Mode

The default mode — the complete Propel pipeline from intake to retrospective, with all skills, gates, and auditors active.

Overview

Engineer Mode is the full Propel experience. Every gate fires, every skill is available, and auditors run automatically after code changes. This is the mode you use when you know what you want to build and you are ready for the complete investigation-design-implement-validate cycle.

If you start a session without choosing a mode and just describe a task, Propel defaults to Engineer Mode. It is backward compatible with the full workflow described in the Pipeline Overview.

Property	Details
Active Gates	G0 Q0 G1 Q1 G2 G3 G4
Active Skills	All skills — investigation, deep-research, paper-extraction, research-design, writing-plans, subagent-driven-research, research-validation, systematic-debugging, verification-before-completion, think-deeply, retrospective, context-hygiene, trainer-mode, using-git-worktrees, project-customization
Active Auditors	All — paper-alignment, silent-bug-detector, jax-logic, regression-guard, env-researcher, data-flow-tracer, failure-mode-researcher, code-reviewer
Switch Command	`/switch engineer`

The Complete Pipeline

Engineer Mode activates every phase of the Propel pipeline. Here is the full sequence:

Intake → G0 → Q0 → Investigation → G1 → Q1 → Design → G2 → Implementation → G3 (loop) → Validation → Debug → G4 → Training → Retrospective
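
One way to picture this sequence is as an ordered list of phases, each optionally closed by a checkpoint. The sketch below is illustrative only: `Phase`, `PIPELINE`, and `checkpoints` are hypothetical names, not part of Propel, and the pairing of phases with checkpoints follows the phase headings in this section.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Phase:
    name: str
    # Gate (G*) or Questioner checkpoint (Q*) tied to the phase, if any.
    checkpoint: Optional[str] = None


# Hypothetical representation; the order mirrors the phase headings.
PIPELINE: List[Phase] = [
    Phase("Intake", "G0"),
    Phase("Grounding", "Q0"),
    Phase("Investigation", "G1"),
    Phase("Details", "Q1"),
    Phase("Design", "G2"),
    Phase("Implementation", "G3"),  # G3 repeats once per component
    Phase("Validation"),
    Phase("Debug", "G4"),           # only entered when something fails
    Phase("Training"),
    Phase("Retrospective"),
]


def checkpoints(pipeline: List[Phase]) -> List[str]:
    """Return the checkpoints in the order they would be encountered."""
    return [p.checkpoint for p in pipeline if p.checkpoint is not None]
```

Calling `checkpoints(PIPELINE)` yields the same seven entries listed under Active Gates in the table above.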

Intake G0

Claude asks 3–5 scoping questions, one at a time. Questions are specific to your project and designed to expose assumptions, boundaries, and priorities. After enough answers, Claude writes a scope statement for your confirmation.

Grounding Q0

The first Questioner checkpoint. Claude asks for reference implementations, architecture patterns, examples to copy from, and benchmarks. This grounds the work in concrete reference points rather than unconstrained generation.

Investigation G1

Claude creates a scratch/ investigation directory, traces code paths, documents architecture, and records findings in a living README. At Gate 1, Claude presents 3–5 findings, surprises, and open questions. See Investigation.
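
As a rough sketch of what the investigation directory might look like on disk, the helper below creates a dated folder with a README skeleton. The `<date>-<slug>` naming follows the path shown in the example session on this page; the function name and the README section headings are assumptions made for illustration.

```python
from datetime import date
from pathlib import Path


def create_investigation(root: Path, slug: str, today: date) -> Path:
    """Create scratch/<date>-<slug>/README.md under root and return its path."""
    inv_dir = root / "scratch" / f"{today.isoformat()}-{slug}"
    inv_dir.mkdir(parents=True, exist_ok=True)
    readme = inv_dir / "README.md"
    if not readme.exists():
        # Section names are assumed; they echo what Gate 1 presents.
        readme.write_text(
            f"# Investigation: {slug}\n\n"
            "## Findings\n\n"
            "## Surprises\n\n"
            "## Open questions\n",
            encoding="utf-8",
        )
    return readme
```

Because the README is only written when it does not exist yet, calling the helper again in a later session leaves recorded findings untouched.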

Details Q1

The second Questioner checkpoint. Nails down implementation specifics: interfaces, data formats, edge cases, integration points, and scope. Answers become binding constraints for design.

Design G2

The research-design skill reads investigation findings and paper notes, creates paper-equation-to-code mappings, identifies regression risks, and proposes config flags. Gate 2 presents the component list, risk assessment, and uncertainties.

Implementation G3

For each component in the approved plan, a 3-stage process runs:

Implement — subagent builds the component from the plan
Review — spec compliance and paper alignment checks
Audit — parallel auditors check for equation mismatches, JAX bugs, silent failures, and regressions

Gate 3 fires after each component, presenting audit results. Every 3 components, Claude offers a /clear pause point.
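
The per-component loop can be sketched roughly as follows. This is a simplified model, not Propel's implementation: `run_components` and its callback parameters are hypothetical, and the auditors are called sequentially here even though the pipeline runs them in parallel.

```python
from typing import Callable, Iterable, List


def run_components(
    components: Iterable[str],
    implement: Callable[[str], None],
    review: Callable[[str], List[str]],
    audit: Callable[[str], List[str]],
    pause_every: int = 3,
) -> List[str]:
    """Run implement, review, and audit per component; return an event log."""
    log: List[str] = []
    for index, component in enumerate(components, start=1):
        implement(component)                             # stage 1: build from the plan
        issues = review(component) + audit(component)    # stages 2 and 3
        status = "clean" if not issues else f"{len(issues)} issue(s)"
        log.append(f"G3 {component}: {status}")          # Gate 3 after each component
        if index % pause_every == 0:
            log.append("offer /clear")                   # pause point every 3 components
    return log
```

With four components and no issues, the log contains four Gate 3 entries and a single `/clear` offer after the third.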

Validation

Four validation gates run in sequence: the shape gate (outputs have the correct shapes), the gradient gate (all parameters receive gradients), the overfit gate (the model memorizes 5 samples within 100 steps), and the regression gate (existing configs produce identical results). See Validation.
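
To make the four checks concrete, here is a minimal framework-free sketch. Everything in it is an assumption for illustration: the function names, the reading of "receives gradients" as a nonzero gradient norm, the loss tolerance, and the choice to stop at the first failing gate. A real project would compute these inputs from its own model and training loop.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def shape_gate(actual: Tuple[int, ...], expected: Tuple[int, ...]) -> bool:
    """Output shape matches the expected shape."""
    return actual == expected


def gradient_gate(grad_norms: Dict[str, float]) -> bool:
    """Every parameter has a nonzero gradient norm (assumed criterion)."""
    return all(norm > 0.0 for norm in grad_norms.values())


def overfit_gate(losses: Sequence[float], max_steps: int = 100, tol: float = 1e-3) -> bool:
    """Loss on the 5 memorized samples drops to tol within max_steps (tol is assumed)."""
    return any(loss <= tol for loss in losses[:max_steps])


def regression_gate(baseline: Sequence[float], current: Sequence[float]) -> bool:
    """An existing config reproduces its baseline outputs exactly."""
    return list(baseline) == list(current)


def run_gates(gates: List[Tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Run gates in order; return the first failing gate's name, or None."""
    for name, check in gates:
        if not check():
            return name
    return None
```

Ordering the gates from cheapest to most expensive means a shape mismatch is caught before any time is spent on an overfit run.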

Debug G4

If validation or training fails, the systematic-debugging skill activates: it characterizes symptoms, forms hypotheses, gathers evidence, and presents a diagnosis at Gate 4 before any fix is applied. The 3-strike limit keeps debugging from looping indefinitely.
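
The exact semantics of the 3-strike limit are not spelled out here, so the following sketch rests on an assumption: a strike is one failed fix attempt, and after three strikes the loop stops and hands the problem back to the user. `debug_with_strikes` and `try_fix` are hypothetical names.

```python
from typing import Callable, Iterable, Tuple

MAX_STRIKES = 3  # assumption: one strike per failed fix attempt


def debug_with_strikes(
    hypotheses: Iterable[str],
    try_fix: Callable[[str], bool],
    max_strikes: int = MAX_STRIKES,
) -> Tuple[str, int]:
    """Try hypotheses in order; stop once max_strikes attempts have failed."""
    strikes = 0
    for hypothesis in hypotheses:
        if strikes >= max_strikes:
            break                      # limit reached, stop trying fixes
        if try_fix(hypothesis):        # fix applied only after a diagnosis
            return ("fixed", strikes)
        strikes += 1
    return ("escalate", strikes)
```

Bounding the number of attempts trades a possible later fix for a guaranteed stop, which is the point of a strike limit.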

Training & Retrospective

Training can be launched inline. After the session, the retrospective skill captures learnings in scratch/registry/, including the failed attempts table — often the most valuable artifact.

Auditor Auto-Dispatch

In Engineer Mode, auditors run automatically after every code change. You do not need to invoke them manually, except for data-flow-tracer, which runs only on explicit request:

What Changed	Auditors Dispatched
Paper-derived component	paper-alignment-auditor
JAX code (scan, vmap, pmap, jit)	jax-logic-auditor
Model, loss, or data code	silent-bug-detector
Any code change	regression-guard
Environment interaction code	env-researcher
Deep trace needed (explicit only)	data-flow-tracer
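
The table can be read as a simple lookup from the kind of change to the auditors that run. The sketch below encodes it that way; the tag names (`paper_derived`, `jax`, and so on) and the `auditors_for` function are hypothetical, and how Propel actually classifies a change is not described here.

```python
from typing import List, Set, Tuple

# (change tag, auditor) pairs taken from the table; tag names are made up.
DISPATCH: List[Tuple[str, str]] = [
    ("paper_derived", "paper-alignment-auditor"),
    ("jax", "jax-logic-auditor"),
    ("model_loss_data", "silent-bug-detector"),
    ("environment", "env-researcher"),
]


def auditors_for(change_tags: Set[str], deep_trace: bool = False) -> List[str]:
    """Return the auditors to dispatch for a code change with the given tags."""
    selected = [auditor for tag, auditor in DISPATCH if tag in change_tags]
    selected.append("regression-guard")      # runs on any code change
    if deep_trace:
        selected.append("data-flow-tracer")  # explicit request only
    return selected
```

For example, a change tagged `{"jax", "paper_derived"}` would dispatch paper-alignment-auditor, jax-logic-auditor, and regression-guard.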

Auditors Are Non-Blocking

Auditors run after code changes and report findings at Gate 3. If they find issues, Claude presents them and asks how to handle them. Clean passes are reported briefly so you know they ran.

When to Use

Implementing paper-derived code — the full pipeline ensures paper alignment at every stage
Building new features — investigation-design-implement-validate cycle catches issues early
Full development cycle — when you need all phases from research through training
When in doubt — Engineer Mode is the default for a reason; it is the safest choice when you are not sure which mode to pick

Example Session

A typical Engineer Mode session implementing a new component:

User:  I want to add residual vector quantization to the existing
       VQ module. Based on the SoundStream paper (Section 3.1).

Claude: [Gate 0 — Intake]
        A few scoping questions:
        1. Should the RVQ support arbitrary depth, or is depth=2
           a hard design choice?

User:  Arbitrary depth, configurable.

Claude: 2. Should existing single-level VQ configs continue to work
           unchanged, or can we modify the VQ interface?

User:  Must be backward compatible. depth=1 should behave exactly
       like the current VQ module.

Claude: Scope: Add configurable-depth RVQ to the VQ module.
        depth=1 preserves existing behavior. Paper reference:
        SoundStream Section 3.1.

        [Q0 — Grounding]
        Which files contain the current VQ implementation I should
        study as the starting point?

User:  model/vq.py and model/encoder.py

Claude: [creates scratch/2026-03-28-rvq/README.md]
        [investigates model/vq.py, model/encoder.py]

        [Gate 1 — Investigation Findings]
        Findings:
        • Current VQ uses EMA codebook updates with stickiness bias
        • Commitment loss is hardcoded to single-level
        • encoder.py calls vq.quantize() — single call site

        [Q1 — Implementation Details]
        The current commitment loss assumes a single codebook.
        For RVQ, should I use per-level scaling (1/num_levels)
        or separate configurable weights per level?

User:  Per-level scaling for now, keep it simple.

Claude: [Gate 2 — Design]
        Components (in order):
        1. RVQ core — residual loop in model/vq.py (Eq. 1-3)
        2. Per-level commitment loss (Eq. 5)
        3. Backward-compatible config flag (depth=1 default)
        4. Integration with encoder.py

        Approve plan?

User:  Go.

Claude: [implements component 1]
        [auditors: paper-alignment ✓, jax-logic ✓, silent-bug ✓, regression-guard ✓]

        [Gate 3 — Component 1]
        All auditors passed. Moving to component 2?
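
For readers unfamiliar with residual vector quantization, the core of component 1 can be sketched in plain Python. This is a toy illustration, not the code from the session and not the SoundStream implementation: it omits EMA codebook updates and gradient handling, all names are hypothetical, and the per-level loss uses the 1/num_levels scaling chosen in the dialogue above.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def nearest(codebook: Sequence[Vector], x: Vector) -> Vector:
    """Return the codebook entry closest to x in squared Euclidean distance."""
    return min(codebook, key=lambda c: sum((ci - xi) ** 2 for ci, xi in zip(c, x)))


def rvq(x: Vector, codebooks: Sequence[Sequence[Vector]]) -> Tuple[Vector, float]:
    """Quantize x with one codebook per level; return (quantized, commitment_loss)."""
    depth = len(codebooks)
    residual = list(x)
    quantized = [0.0] * len(x)
    loss = 0.0
    for codebook in codebooks:
        code = nearest(codebook, residual)
        # Commitment term for this level, scaled by 1/depth.
        loss += sum((r - c) ** 2 for r, c in zip(residual, code)) / depth
        quantized = [q + c for q, c in zip(quantized, code)]
        # The next level quantizes whatever this level failed to capture.
        residual = [r - c for r, c in zip(residual, code)]
    return quantized, loss
```

With a single codebook the loop runs once and reduces to ordinary nearest-neighbor vector quantization, which is the backward-compatibility property the user asked for with depth=1.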