Paper Alignment Auditor

Rigorous cross-referencing of code implementations against their source paper's equations, algorithms, and architecture descriptions.

Overview

The Paper Alignment Auditor is a specialized subagent that verifies whether your code faithfully implements what a research paper describes. It maps equations to code, checks architecture details, traces algorithm flows, and cross-checks hyperparameters — catching subtle misalignments before they waste training cycles.

Provide this agent with the paper (figures exported as PNGs) and the relevant source files. It produces a structured discrepancy report covering what matches, what diverges, and what remains ambiguous.

| Property | Details |
| --- | --- |
| Tools | Read, Grep, Glob (read-only) |
| Auto-Dispatch | Yes — after implementing components from a paper |
| Trigger | Model architecture or loss function changes; new paper implementations |

Equation-to-Code Verification

The auditor maps each equation in the paper to its code implementation and verifies mathematical fidelity.
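As a minimal sketch of what "mathematical fidelity" means in practice, suppose the paper's Eq. (1) is scaled dot-product attention (the equation and its numbering here are assumptions for illustration, not taken from a specific paper). The audit checks that every term in the equation has a matching statement in code:

```python
import math
import numpy as np

# Hypothetical Eq. (1): Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
# Each line below is annotated with the paper symbol it implements, which is
# exactly the mapping the auditor builds and verifies.

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]                       # d_k in Eq. (1)
    scores = Q @ K.T / math.sqrt(d_k)       # Q K^T / sqrt(d_k)
    return softmax(scores, axis=-1) @ V     # softmax(...) V
```

A faithful implementation must keep the `1/sqrt(d_k)` scaling and normalize over the key axis; dropping either is exactly the kind of silent divergence the auditor reports.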

Precision Required

The auditor references exact equation numbers, figure numbers, and section numbers from the paper, paired with exact file paths and line numbers from the code. Vague references like "the loss function" are not acceptable.

Architecture Alignment

Structural verification of the model against the paper's described architecture.
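The shape of such a check can be sketched as a field-by-field comparison between what the code actually builds and what the paper's architecture table reports (the field names and values below are illustrative assumptions, not from any real paper):

```python
# Hypothetical paper spec, e.g. "Table 2: 6 layers, d_model = 512, 8 heads".
PAPER_ARCH = {"num_layers": 6, "d_model": 512, "num_heads": 8}

def audit_architecture(code_arch, paper_arch=PAPER_ARCH):
    """Return a list of (field, code_value, paper_value) mismatches."""
    return [(field, code_arch.get(field), paper_v)
            for field, paper_v in paper_arch.items()
            if code_arch.get(field) != paper_v]
```

For example, a model built with `d_model=768` against this spec yields `[("d_model", 768, 512)]` — precisely the kind of structural discrepancy the report lists with a paper reference and a code location.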

Algorithm Flow Verification

The auditor traces the training loop against the paper's algorithm pseudocode.
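One way to picture this trace is as an ordering check: the sequence of operations the code actually executes is compared step-by-step against the order in the paper's pseudocode (the step names below are assumptions for illustration):

```python
# Hypothetical Algorithm 1 ordering from the paper.
PSEUDOCODE_ORDER = ["sample_batch", "forward", "compute_loss", "backward", "update"]

def audit_flow(executed, expected=PSEUDOCODE_ORDER):
    """Return the first (index, expected_step, actual_step) divergence, or None."""
    for i in range(max(len(executed), len(expected))):
        e = expected[i] if i < len(expected) else None
        a = executed[i] if i < len(executed) else None
        if e != a:
            return (i, e, a)
    return None
```

A loop that, say, updates parameters before computing the loss would surface here as the first index where the two sequences diverge.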

Hyperparameter Cross-Check

Compares default hyperparameters in code and config against the paper's reported values.
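A sketch of this cross-check, with a float tolerance so harmless representation differences are not flagged (the hyperparameter names and values are illustrative assumptions, not from a specific paper):

```python
import math

# Hypothetical values the paper reports, e.g. in its experimental-setup section.
PAPER_HPARAMS = {"lr": 3e-4, "batch_size": 256, "warmup_steps": 4000}

def audit_hparams(config, paper=PAPER_HPARAMS, rel_tol=1e-6):
    """Return {name: (code_value, paper_value)} for every mismatch."""
    diffs = {}
    for name, paper_v in paper.items():
        code_v = config.get(name)
        if isinstance(paper_v, float) and isinstance(code_v, float):
            same = math.isclose(code_v, paper_v, rel_tol=rel_tol)  # tolerate float noise
        else:
            same = code_v == paper_v
        if not same:
            diffs[name] = (code_v, paper_v)
    return diffs
```

A config with `lr=1e-3` against this spec would be reported as `{"lr": (0.001, 0.0003)}`, leaving it to the user to decide whether the deviation was intentional.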

Common Misalignment Patterns

These are the patterns the auditor specifically watches for — each one has caused real bugs in research code:

| Pattern | What Goes Wrong |
| --- | --- |
| Wrong softmax axis | Softmax applied over the wrong dimension — output looks like probabilities but they sum to 1 along the wrong axis |
| Missing temperature/scaling | A temperature or scaling parameter from the paper is omitted, changing the sharpness of distributions |
| KL direction reversed | KL(q\|\|p) vs KL(p\|\|q) — mode-seeking vs mode-covering behavior. Papers often specify one; code implements the other |
| Mean vs sum reduction | Using mean where the paper uses sum (or vice versa) in losses changes the effective learning rate |
| Off-by-one in sequences | Sequence indexing shifted by one position — the model attends to or predicts the wrong time step |
| Missing positional encoding | Positional encoding described in the paper but absent or incorrectly implemented in code |
| Wrong commitment loss coefficient | VQ-VAE-style methods with incorrect beta for the commitment loss term |
| Wrong reward discounting | RL components with incorrect discount factor application — especially in multi-step returns |
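Two of these patterns can be made concrete in a few lines of NumPy (a minimal sketch; the example values are arbitrary):

```python
import numpy as np

# Mean vs sum reduction: with N terms, a mean-reduced loss is 1/N times the
# sum-reduced one, so its gradients are too — equivalent to silently dividing
# the learning rate by N.
errors = np.array([1.0, 2.0, 3.0, 4.0])
assert errors.sum() == errors.mean() * len(errors)

# Wrong softmax axis: both results "look like" probabilities, but only one
# normalizes over the intended dimension.
def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

logits = np.arange(6.0).reshape(2, 3)
p_rows = softmax(logits, axis=1)  # each row sums to 1 (usually the intent)
p_cols = softmax(logits, axis=0)  # each column sums to 1 (the bug)
```

Both bugs pass a casual "the outputs are between 0 and 1" sanity check, which is why the auditor verifies the axis and reduction against the paper's notation rather than against the output's appearance.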

Output Format

The auditor produces a structured report with four sections:

Verified Alignments

Components confirmed to match the paper, with specific equation/section references paired to code locations.

Discrepancies

Each discrepancy includes the exact paper reference (equation, figure, or section number), the exact code location (file path and line number), and a description of how the implementation diverges from the paper.

Ambiguities

Cases where the paper is unclear and the code makes an assumption. The auditor flags these for the user to verify with the paper's authors or resolve through ablation.

Not Verified

Components that couldn't be checked with the available information — for example, figures that weren't provided as PNGs, or implementation details not covered in the paper.

Context Is Key

Some deviations from the paper are intentional improvements. The auditor flags them but does not assume they are bugs — it lets the user decide whether each deviation is acceptable.