Reproducibility Checklist
1 Why This Page Matters
A paper can have an interesting idea and still lose trust if readers cannot tell how to reproduce the result.
Reproducibility is not only about posting code.
It is about making the full evidence path legible:
- what was run
- under which settings
- on which data
- with which metrics
- and with which sources of variance
2 What This Checklist Is For
Use this page before submission, before release, and before writing camera-ready revisions.
The goal is simple:
could a careful reader reproduce the main claims without having to guess at missing pieces?
3 Core Checklist
3.1 Problem Setup
- Is the task definition explicit?
- Are training, validation, and test splits described clearly?
- Are data preprocessing or filtering steps visible?
- Are the target metrics defined before the results table appears?
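A lightweight way to satisfy the split and preprocessing items is to make split membership a recorded artifact rather than an implementation detail. The sketch below is a minimal illustration; the path, ratios, and function name are assumptions, not part of any particular pipeline.

```python
# Minimal sketch: deterministic splits recorded alongside the run.
# RAW_PATH and make_splits are illustrative, not a prescribed API.
import json
import random

RAW_PATH = "data/records.jsonl"  # hypothetical dataset location

def make_splits(n_examples: int, seed: int = 13) -> dict:
    """Shuffle indices with a fixed seed and cut fixed-ratio splits."""
    rng = random.Random(seed)
    indices = list(range(n_examples))
    rng.shuffle(indices)
    n_train = int(0.8 * n_examples)
    n_val = int(0.1 * n_examples)
    splits = {
        "train": indices[:n_train],
        "val": indices[n_train:n_train + n_val],
        "test": indices[n_train + n_val:],
    }
    # Persist the exact membership so a reader can audit it later.
    with open("splits.json", "w") as f:
        json.dump({"seed": seed, "source": RAW_PATH, "splits": splits}, f)
    return splits
```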
3.2 Model Or Method Specification
- Is the method described precisely enough to re-implement?
- Are architecture or solver choices stated explicitly?
- Are important hyperparameters listed?
- Are stopping criteria, tolerances, or iteration budgets reported?
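One way to keep hyperparameters, budgets, and stopping criteria honest is to hold them in a single serialized config that travels with every run. A minimal sketch, assuming Python with illustrative field names and defaults:

```python
# Minimal sketch: every hyperparameter lives in one serializable config
# so the paper, the code, and the logs cannot drift apart.
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class TrainConfig:
    learning_rate: float = 3e-4
    batch_size: int = 64
    max_steps: int = 50_000        # iteration budget
    early_stop_patience: int = 5   # stopping criterion
    grad_clip: float = 1.0
    seed: int = 0

config = TrainConfig()
with open("config.json", "w") as f:
    json.dump(asdict(config), f, indent=2)  # ship this file with the run
```

Shipping this file next to the results also makes the artifact items in 3.6 nearly free.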
3.3 Data And Evaluation
- Is the dataset source named clearly?
- Are versions, subsets, or benchmark variants specified?
- Are baseline implementations identified?
- Are evaluation conditions matched fairly across methods?
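For the dataset items, a content hash pinned in the artifact guide makes benchmark-variant mismatches detectable instead of silent. A minimal sketch; the file path and expected digest are placeholders:

```python
# Minimal sketch: pin the exact evaluation file by content hash so a
# benchmark-variant mismatch fails loudly. Paths are illustrative.
import hashlib

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

EXPECTED = "0" * 64  # replace with the hash published in the artifact guide

digest = sha256_of("data/test_set.jsonl")
if digest != EXPECTED:
    raise RuntimeError(f"test set mismatch: got {digest[:12]}..., "
                       "results will not be comparable")
```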
3.4 Randomness And Variance
- Are seeds, repeated runs, or confidence summaries reported where relevant?
- Does the paper say whether results are single-run, averaged, or selected?
- Are unstable regimes or high-variance settings acknowledged?
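When variance matters, the cheapest credible summary is a fixed list of seeds plus a mean and standard deviation. A minimal sketch, where run_experiment is a hypothetical stand-in for a full training-and-evaluation run:

```python
# Minimal sketch: report multi-seed results instead of a single run.
# run_experiment is a hypothetical entry point returning one metric.
import random
import statistics

def run_experiment(seed: int) -> float:
    rng = random.Random(seed)          # stand-in for real training
    return 0.80 + rng.gauss(0, 0.01)   # hypothetical accuracy

seeds = [0, 1, 2, 3, 4]
scores = [run_experiment(s) for s in seeds]
mean = statistics.mean(scores)
std = statistics.stdev(scores)
print(f"accuracy: {mean:.3f} ± {std:.3f} over {len(seeds)} seeds {seeds}")
```

Reporting the seed list itself, not just the spread, is what lets a reader rerun the exact same set.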
3.5 Compute And Runtime Context
- Is the hardware or runtime environment described when efficiency matters?
- Are memory, batch size, or wall-clock details reported when they affect the claim?
- Are comparisons fair on compute budget when speed or scale is part of the story?
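A small environment snapshot written next to the results makes runtime and memory claims auditable later. A minimal sketch using only the standard library; extend it with whatever accelerator details your framework exposes:

```python
# Minimal sketch: snapshot the runtime environment next to the results
# so wall-clock and memory numbers can be interpreted later.
import json
import platform
import sys
import time

env = {
    "python": sys.version,
    "machine": platform.machine(),
    "processor": platform.processor(),
    "os": platform.platform(),
    "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
    # add accelerator info if available, e.g.
    # torch.cuda.get_device_name(0) under PyTorch
}
with open("environment.json", "w") as f:
    json.dump(env, f, indent=2)
```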
3.6 Artifacts
- Is code available, or is release status stated clearly?
- Are configuration files, scripts, or command patterns included?
- Are trained checkpoints, generated data, or other intermediate artifacts provided when they are needed to reproduce the results?
- Are licenses or access restrictions on data and code mentioned?
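A run manifest can tie these artifact items together by recording the exact code version, invocation, and file paths. A minimal sketch, assuming the project lives in a git checkout; the artifact paths are illustrative:

```python
# Minimal sketch: a run manifest tying artifacts to the exact code
# version and command. Assumes the project is a git checkout.
import json
import subprocess
import sys

commit = subprocess.run(
    ["git", "rev-parse", "HEAD"],
    capture_output=True, text=True, check=True,
).stdout.strip()

manifest = {
    "commit": commit,
    "command": " ".join(sys.argv),       # the exact invocation
    "config": "config.json",             # files shipped with the run
    "checkpoint": "checkpoints/last.pt", # hypothetical artifact path
    "license": "see LICENSE and DATA_LICENSE",
}
with open("run_manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)
```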
4 Common Failure Modes
- tables report results but do not say how hyperparameters were chosen
- baseline settings are underdescribed while the proposed method is overdescribed
- dataset splits or preprocessing choices are missing
- only the best runs are shown, without saying so
- reproducibility depends on hidden engineering choices that never appear in the paper
5 A Practical Submission Loop
Before submission, force the paper through this loop:
- ask one teammate to reproduce one main result from the written instructions alone
- list every place where they had to guess
- move the most important guessed items into the paper, appendix, or artifact guide
- check that the paper’s strongest claims still have matching reproducibility support
This loop often reveals gaps faster than another round of prose polishing.
6 How This Connects To The Site
- Writing Experiment Sections helps shape the evidence; this page helps make that evidence rerunnable and auditable.
- Claim-Evidence Matrix helps decide what evidence must exist before reproducibility details are even useful.
- Review and Rebuttal is where many reproducibility weaknesses surface as reviewer questions.