Data Flow Tracer

Builds a complete, annotated map of how data enters a system, transforms step by step, and exits — across any framework, language, or paradigm.

Overview

The Data Flow Tracer follows data, not code structure. Code is organized by module, class, and function — but data doesn't respect those boundaries. This agent traces the data's actual path, crossing whatever boundaries it crosses, through PyTorch, NumPy, TensorFlow, JAX, C++ pipelines, ROS nodes, or any combination.

| Property | Details |
| --- | --- |
| Tools | Read, Grep, Glob (read-only) |
| Auto-Dispatch | Yes, when the data pipeline or preprocessing changes |
| Trigger | Data pipeline changes, environment integration, model input/output modifications |

Input-to-Output Tracing

The tracer's primary job is complete end-to-end tracing:
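One way to picture an end-to-end trace is as an ordered list of stages, each recording what it consumes and what it produces. The sketch below is only an illustration of that idea, not the tracer's actual report format: the `Stage` record, the `find_gaps` helper, and the file locations are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One hop in a data-flow trace (hypothetical record, for illustration)."""
    name: str      # human-readable stage name
    location: str  # where the code lives (made-up paths below)
    consumes: str  # description of the data this stage reads
    produces: str  # description of the data this stage writes


def find_gaps(stages):
    """Return (upstream, downstream) name pairs whose hand-off does not line up."""
    gaps = []
    for upstream, downstream in zip(stages, stages[1:]):
        if upstream.produces != downstream.consumes:
            gaps.append((upstream.name, downstream.name))
    return gaps


# Hypothetical three-stage image pipeline.
trace = [
    Stage("load", "loader.py:42", "png file", "uint8 image HWC"),
    Stage("preprocess", "transforms.py:10", "uint8 image HWC", "float32 image CHW"),
    Stage("model", "net.py:88", "float32 image NCHW", "logits"),
]

gaps = find_gaps(trace)
print(gaps)  # [('preprocess', 'model')]
```

Here the preprocessing stage produces an unbatched CHW image while the model expects a batched NCHW one, so the hand-off between them shows up as a gap that needs explaining.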

Semantic Annotation

Shapes and dtypes are not enough. The tracer tracks meaning:
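A small sketch can show why shape and dtype alone fall short. It assumes a hypothetical `Annotation` record with `meaning`, `units`, and `frame` fields; these field names are illustrative choices, not a format the tracer is documented to use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """What a value is, beyond its container (hypothetical fields)."""
    shape: tuple
    dtype: str
    meaning: str
    units: str
    frame: str


def semantic_mismatches(produced, expected):
    """List the fields that differ between what one stage produces and the next expects."""
    fields = ("shape", "dtype", "meaning", "units", "frame")
    return [f for f in fields if getattr(produced, f) != getattr(expected, f)]


produced = Annotation((3,), "float32", "end-effector position", "millimetres", "camera")
expected = Annotation((3,), "float32", "end-effector position", "metres", "robot base")

mismatches = semantic_mismatches(produced, expected)
print(mismatches)  # ['units', 'frame']
```

A check limited to shape and dtype would pass this pair, even though the values differ in both units and coordinate frame.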

Branching and Merging

Real pipelines aren't linear. The tracer tracks:

Boundary Analysis

The tracer pays special attention to boundaries where data crosses between systems:

| Boundary | What to Verify |
| --- | --- |
| Data loading → preprocessing | File formats parsed correctly? Dtypes preserved or silently cast? |
| Preprocessing → model input | Does the model expect the exact format that preprocessing produces? |
| Model output → postprocessing | Outputs denormalized/decoded correctly? |
| Software → hardware | Actions clipped, scaled, and in correct units before actuators? |
| Between processes/nodes | Serialization/deserialization preserves data correctly? |
| Between frameworks | NumPy/PyTorch/JAX conversions handle memory layouts, dtypes, device placement? |
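As an illustration of the "between processes/nodes" question, the sketch below sends Python floats (64-bit) through a 32-bit wire format using the standard `struct` module and reports which values do not survive the round trip. The helper names are hypothetical, and the float32 wire format is an assumption chosen for the example.

```python
import struct


def roundtrip_float32(values):
    """Serialize as little-endian float32 and read the values back."""
    fmt = f"<{len(values)}f"
    payload = struct.pack(fmt, *values)
    return list(struct.unpack(fmt, payload))


def changed_by_boundary(values):
    """Return the indices whose value is not preserved exactly by the round trip."""
    received = roundtrip_float32(values)
    return [i for i, (sent, got) in enumerate(zip(values, received)) if sent != got]


sent = [0.5, 0.1, 1024.0]
changed = changed_by_boundary(sent)
print(changed)  # [1]
```

Values such as 0.5 and 1024.0 are exactly representable in 32 bits, while 0.1 is silently rounded. Whether that loss matters depends on the downstream consumer, which is why the boundary is worth flagging rather than assuming.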

Mutation Tracking

The tracer tracks where data gets modified in place:
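The hazard can be shown with a minimal example. The tracer itself only reads code; the runtime check below (`mutates_argument`, a hypothetical helper) is just a way to demonstrate the kind of aliasing the trace should call out.

```python
import copy


def scale_in_place(batch, factor):
    """Multiply every element of `batch` in place and return the same list."""
    for i in range(len(batch)):
        batch[i] *= factor
    return batch


def mutates_argument(fn, arg, *rest):
    """Report whether calling fn(arg, *rest) changes arg."""
    snapshot = copy.deepcopy(arg)
    fn(arg, *rest)
    return arg != snapshot


raw = [1.0, 2.0, 3.0]
scaled = scale_in_place(raw, 0.5)

print(scaled is raw)  # True: the "result" is the caller's own list
print(raw)            # [0.5, 1.0, 1.5]: the original values are gone
print(mutates_argument(scale_in_place, [1.0, 2.0], 2.0))  # True
print(mutates_argument(sorted, [2, 1]))                    # False
```

Any later stage that still holds a reference to `raw` now sees scaled values, which is easy to miss when reading the call site alone.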

Missing Transformations

The tracer flags places where important transformations are absent:

Output Format

The tracer produces a structured report containing:

Key Principles

No gaps in the trace: a gap is a finding, not something to skip.

Read the code, don't infer from names: a function called normalize() might do anything.

The bugs are in the transitions: most data flow bugs are in the hand-offs between stages.

Units and frames matter: a perfectly shaped tensor in the wrong coordinate frame will produce plausible-looking but wrong behavior.
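To make the point about names concrete, here are two functions that could both reasonably be called normalize() yet produce different data from the same input. Both are illustrative sketches written for this example, not code from any particular project.

```python
import math


def normalize_minmax(xs):
    """Rescale so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def normalize_unit_length(xs):
    """Rescale so the vector has Euclidean length 1."""
    norm = math.sqrt(sum(x * x for x in xs))
    return [x / norm for x in xs]


data = [0.0, 3.0, 4.0]
print(normalize_minmax(data))       # [0.0, 0.75, 1.0]
print(normalize_unit_length(data))  # [0.0, 0.6, 0.8]
```

Both outputs have the same shape and dtype as the input, so only reading the function body reveals which transformation the downstream stage actually receives.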