Paper Alignment Auditor

Rigorous cross-referencing of code implementations against their source paper's equations, algorithms, and architecture descriptions.

Overview

The Paper Alignment Auditor is a specialized subagent that verifies whether your code faithfully implements what a research paper describes. It maps equations to code, checks architecture details, traces algorithm flows, and cross-checks hyperparameters — catching subtle misalignments before they waste training cycles.

Provide this agent with the paper (figures exported as PNGs) and the relevant source files. It produces a structured discrepancy report covering what matches, what diverges, and what remains ambiguous.

| Property | Details |
| --- | --- |
| Tools | Read, Grep, Glob (read-only) |
| Auto-Dispatch | Yes — after implementing components from a paper |
| Trigger | Model architecture or loss function changes; new paper implementations |

Equation-to-Code Verification

The auditor maps each equation in the paper to its code implementation and verifies mathematical fidelity.
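As a minimal sketch of what "mathematical fidelity" means in practice, suppose the paper's Eq. (1) is scaled dot-product attention (the equation and its numbering here are assumptions for illustration, not taken from a specific paper). The audit checks that every term in the equation has a matching statement in code:

```python
import math
import numpy as np

# Hypothetical Eq. (1): Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
# Each line below is annotated with the paper symbol it implements, which is
# exactly the mapping the auditor builds and verifies.

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]                       # d_k in Eq. (1)
    scores = Q @ K.T / math.sqrt(d_k)       # Q K^T / sqrt(d_k)
    return softmax(scores, axis=-1) @ V     # softmax(...) V
```

A faithful implementation must keep the `1/sqrt(d_k)` scaling and normalize over the key axis; dropping either is exactly the kind of silent divergence the auditor reports.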

Precision Required

The auditor references exact equation numbers, figure numbers, and section numbers from the paper, paired with exact file paths and line numbers from the code. Vague references like "the loss function" are not acceptable.

Architecture Alignment

Structural verification of the model against the paper's described architecture.
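The shape of such a check can be sketched as a field-by-field comparison between what the code actually builds and what the paper's architecture table reports (the field names and values below are illustrative assumptions, not from any real paper):

```python
# Hypothetical paper spec, e.g. "Table 2: 6 layers, d_model = 512, 8 heads".
PAPER_ARCH = {"num_layers": 6, "d_model": 512, "num_heads": 8}

def audit_architecture(code_arch, paper_arch=PAPER_ARCH):
    """Return a list of (field, code_value, paper_value) mismatches."""
    return [(field, code_arch.get(field), paper_v)
            for field, paper_v in paper_arch.items()
            if code_arch.get(field) != paper_v]
```

For example, a model built with `d_model=768` against this spec yields `[("d_model", 768, 512)]` — precisely the kind of structural discrepancy the report lists with a paper reference and a code location.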

Algorithm Flow Verification

The auditor traces the training loop against the paper's algorithm pseudocode.
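One way to picture this trace is as an ordering check: the sequence of operations the code actually executes is compared step-by-step against the order in the paper's pseudocode (the step names below are assumptions for illustration):

```python
# Hypothetical Algorithm 1 ordering from the paper.
PSEUDOCODE_ORDER = ["sample_batch", "forward", "compute_loss", "backward", "update"]

def audit_flow(executed, expected=PSEUDOCODE_ORDER):
    """Return the first (index, expected_step, actual_step) divergence, or None."""
    for i in range(max(len(executed), len(expected))):
        e = expected[i] if i < len(expected) else None
        a = executed[i] if i < len(executed) else None
        if e != a:
            return (i, e, a)
    return None
```

A loop that, say, updates parameters before computing the loss would surface here as the first index where the two sequences diverge.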

Hyperparameter Cross-Check

Compares default hyperparameters in code and config against the paper's reported values.
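A sketch of this cross-check, with a float tolerance so harmless representation differences are not flagged (the hyperparameter names and values are illustrative assumptions, not from a specific paper):

```python
import math

# Hypothetical values the paper reports, e.g. in its experimental-setup section.
PAPER_HPARAMS = {"lr": 3e-4, "batch_size": 256, "warmup_steps": 4000}

def audit_hparams(config, paper=PAPER_HPARAMS, rel_tol=1e-6):
    """Return {name: (code_value, paper_value)} for every mismatch."""
    diffs = {}
    for name, paper_v in paper.items():
        code_v = config.get(name)
        if isinstance(paper_v, float) and isinstance(code_v, float):
            same = math.isclose(code_v, paper_v, rel_tol=rel_tol)  # tolerate float noise
        else:
            same = code_v == paper_v
        if not same:
            diffs[name] = (code_v, paper_v)
    return diffs
```

A config with `lr=1e-3` against this spec would be reported as `{"lr": (0.001, 0.0003)}`, leaving it to the user to decide whether the deviation was intentional.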

Common Misalignment Patterns

These are the patterns the auditor specifically watches for — each one has caused real bugs in research code:

| Pattern | What Goes Wrong |
| --- | --- |
| Wrong softmax axis | Softmax applied over the wrong dimension — output looks like probabilities but they sum to 1 along the wrong axis |
| Missing temperature/scaling | A temperature or scaling parameter from the paper is omitted, changing the sharpness of distributions |
| KL direction reversed | KL(q\|\|p) vs KL(p\|\|q) — mode-seeking vs mode-covering behavior. Papers often specify one; code implements the other |
| Mean vs sum reduction | Using mean where the paper uses sum (or vice versa) in losses changes the effective learning rate |
| Off-by-one in sequences | Sequence indexing shifted by one position — the model attends to or predicts the wrong time step |
| Missing positional encoding | Positional encoding described in the paper but absent or incorrectly implemented in code |
| Wrong commitment loss coefficient | VQ-VAE-style methods with incorrect beta for the commitment loss term |
| Wrong reward discounting | RL components with incorrect discount factor application — especially in multi-step returns |
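Two of these patterns can be made concrete in a few lines of NumPy (a minimal sketch; the example values are arbitrary):

```python
import numpy as np

# Mean vs sum reduction: with N terms, a mean-reduced loss is 1/N times the
# sum-reduced one, so its gradients are too — equivalent to silently dividing
# the learning rate by N.
errors = np.array([1.0, 2.0, 3.0, 4.0])
assert errors.sum() == errors.mean() * len(errors)

# Wrong softmax axis: both results "look like" probabilities, but only one
# normalizes over the intended dimension.
def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

logits = np.arange(6.0).reshape(2, 3)
p_rows = softmax(logits, axis=1)  # each row sums to 1 (usually the intent)
p_cols = softmax(logits, axis=0)  # each column sums to 1 (the bug)
```

Both bugs pass a casual "the outputs are between 0 and 1" sanity check, which is why the auditor verifies the axis and reduction against the paper's notation rather than against the output's appearance.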

Output Format

The auditor produces a structured report with four sections:

Verified Alignments

Components confirmed to match the paper, with specific equation/section references paired to code locations.

Discrepancies

Each discrepancy includes the exact paper reference (equation, figure, or section number), the exact code location (file path and line number), and a description of how the implementation diverges from the paper.

Ambiguities

Cases where the paper is unclear and the code makes an assumption. The auditor flags these for the user to verify with the paper's authors or resolve through ablation.

Not Verified

Components that couldn't be checked with the available information — for example, figures that weren't provided as PNGs, or implementation details not covered in the paper.

Context Is Key

Some deviations from the paper are intentional improvements. The auditor flags them but does not assume they are bugs — it lets the user decide whether each deviation is acceptable.