Monitor

A supervised self-monitoring loop for long-running training runs and multi-step command chains. Clarifies the success criteria first, prints them for the user, then iterates check → debug → verify until done.

Token-Costly by Design

/monitor runs an iterative loop that reads progress, evaluates against a checklist, and can dispatch debuggers each iteration. The skill requires explicit user confirmation before entering the loop. If you just need a single check, use a one-shot inspection instead.

Overview

Monitor turns Claude into a patient supervisor for work that takes time: training runs, hyperparameter sweeps, deploy pipelines, long command chains. It is deliberately not a dumb polling loop. Every run has three phases: clarify, loop, exit.

When It Triggers

| Trigger | Details |
| --- | --- |
| Manual | "/monitor", "watch this run", "keep an eye on", "monitor training", "run this and make sure it finishes" |
| Contextual | User hands off a multi-step chain and wants Claude to supervise through completion |

Phase 1 — Clarify Requirements

Before any loop runs, the skill extracts a concrete success checklist, hard blockers (terminate), and debuggable soft failures (try to fix). It prints a reference card for the user:

┌─ /monitor plan ──────────────────────────────────────┐
│ Goal:        <one line>                              │
│ Success criteria:                                    │
│   [ ] <item 1>                                       │
│   [ ] <item 2>                                       │
│ Hard blockers (terminate loop):                      │
│   • <item>                                           │
│ Debuggable failures (try to fix):                    │
│   • <item>                                           │
│ Budget: up to N iterations, ~M min between checks    │
└──────────────────────────────────────────────────────┘

The user confirms or edits the plan before the loop starts.
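The confirmed plan can be modeled as a small data structure. A minimal Python sketch — the class name `MonitorPlan`, its fields, and the `render` helper are all hypothetical illustrations, not a fixed schema:

```python
from dataclasses import dataclass

@dataclass
class MonitorPlan:
    """A confirmed /monitor plan. Field names are illustrative."""
    goal: str
    criteria: dict        # item -> "met" | "not-yet" | "regressed"
    hard_blockers: list   # conditions that terminate the loop
    debuggable: list      # soft failures worth dispatching a debugger for
    max_iterations: int = 20
    interval_min: int = 5

    def render(self) -> str:
        """Render the reference card the user confirms before the loop starts."""
        lines = [f"Goal: {self.goal}", "Success criteria:"]
        lines += [f"  [{'x' if s == 'met' else ' '}] {item}"
                  for item, s in self.criteria.items()]
        lines.append("Hard blockers (terminate loop):")
        lines += [f"  - {b}" for b in self.hard_blockers]
        lines.append("Debuggable failures (try to fix):")
        lines += [f"  - {d}" for d in self.debuggable]
        lines.append(f"Budget: up to {self.max_iterations} iterations, "
                     f"~{self.interval_min} min between checks")
        return "\n".join(lines)
```

Keeping the plan as data rather than prose makes the later evaluate step mechanical: each iteration only has to update the `criteria` statuses.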

Phase 2 — The Loop

Each iteration runs the same four steps, logged to scratch/monitor/{YYYY-MM-DD}-{short-name}/log.md:

  1. Inspect: cheap reads of the authoritative progress source (tail logs, nvidia-smi, squeue, metric endpoint).
  2. Evaluate: mark each checklist item as met / not-yet / regressed.
  3. Decide: exit on success or hard blocker, dispatch a debugger on soft failure, otherwise schedule the next check via ScheduleWakeup.
  4. Check budget: if iterations ≥ max, stop and report status — never silently keep going.
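The four steps above can be sketched as a single loop. This is an illustrative Python sketch, not the skill's implementation: `inspect`, `evaluate`, and `debug` stand in for the real tool calls, and `schedule_wakeup` stands in for ScheduleWakeup.

```python
import time

def monitor_loop(plan, inspect, evaluate, debug, schedule_wakeup=time.sleep):
    """One supervised run: inspect -> evaluate -> decide -> budget check."""
    for iteration in range(1, plan.max_iterations + 1):
        # 1. Inspect: cheap read of the authoritative progress source.
        snapshot = inspect()
        # 2. Evaluate: mark each checklist item met / not-yet / regressed.
        plan.criteria = evaluate(snapshot)
        statuses = plan.criteria.values()
        # 3. Decide: exit on success or hard blocker, debug soft failures.
        if all(s == "met" for s in statuses):
            return "success", iteration
        if any(blocker in snapshot for blocker in plan.hard_blockers):
            return "hard-blocker", iteration
        if any(s == "regressed" for s in statuses):
            debug(snapshot)
        schedule_wakeup(plan.interval_min * 60)
    # 4. Budget exhausted: stop and report — never silently keep going.
    return "budget-exhausted", plan.max_iterations
```

Note the two exit paths that return early (success, hard blocker) versus the one that falls out of the loop (budget), matching the decision table described above.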

Phase 3 — Exit Protocol

On exit (success, failure, or budget exhaustion), the skill always produces: outcome, final checklist marks, what was done, what's left, and suggested next step. If the run produced learnings, it recommends /retrospective.
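The exit summary can be assembled mechanically from the loop's final state. A hypothetical sketch — the function name and its parameters are illustrative:

```python
def exit_report(outcome, criteria, done, remaining, next_step, learnings=False):
    """Assemble the mandatory exit summary: outcome, final checklist marks,
    what was done, what's left, and a suggested next step."""
    marks = "\n".join(
        f"  [{'x' if status == 'met' else ' '}] {item}"
        for item, status in criteria.items()
    )
    lines = [
        f"Outcome: {outcome}",
        "Final checklist:",
        marks,
        "Done: " + ("; ".join(done) or "nothing"),
        "Left: " + ("; ".join(remaining) or "nothing"),
        f"Suggested next step: {next_step}",
    ]
    if learnings:
        lines.append("Recommended: /retrospective to capture learnings.")
    return "\n".join(lines)
```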

Guardrails

| Guardrail | Why |
| --- | --- |
| No destructive writes inside the loop without per-action approval | The loop has authority to read and diagnose, not to rewrite live infra. |
| Never claim success just because nothing crashed | Evaluate against the confirmed checklist — silence is not victory. |
| Renegotiate if criteria turn out to be ill-formed | Don't march toward the wrong goal just because the user wrote it down once. |
| Keep per-iteration logs to 2–5 lines | The log is for audit, not narrative. |