EXP-0002 — Complexity Beats Volume (D0 vs D0+D1)
experiment_id: EXP-0002 · status: draft
Status: Draft
Created: 2025-12-25 · Last Updated: 2025-12-25
Hypothesis
If we inject slightly more complex narrative structure (D1) into the baseline (D0) via a controlled mix, then late-stage learning will improve faster than simply training longer on D0 alone, because broader structure reduces saturation and encourages richer internal representations.
Setup / Test Plan
What stays fixed:
- Model config and training loop (same hyperparameters).
- Total training budget held constant across conditions.
- Prompt suite fixed:
  - File: organism/prompts/v1.json
  - prompt_set_id: month1_v1
- Eval cadence and deterministic settings fixed.
Conditions:
- Baseline: D0 only (fairy tales).
- Variant: D0 + D1 mix (weights fixed and recorded).
Data:
- D0: data/staging/phases/phase0a_early-childhood/fairy_tales.jsonl
- D1: data/staging/phases/phase0a_early-childhood/gutenberg_childrens_literature.jsonl
- Manifest (authoritative): data/manifests/month1_manifest_v1.yaml
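As an illustration of what a "controlled mix" with fixed, recorded weights could look like, here is a minimal sketch. The function name `mix_streams`, the `d1_weight` parameter, and sampling with replacement are assumptions made for this example; the actual weights live in the manifest and are not reproduced here. Records are passed in as in-memory lists, so loading the JSONL files is left out.

```python
import random
from typing import Iterator, Sequence, Tuple


def mix_streams(
    d0: Sequence[str],
    d1: Sequence[str],
    d1_weight: float,
    n_samples: int,
    seed: int = 0,
) -> Iterator[Tuple[str, str]]:
    """Yield (source, record) pairs, drawing from D1 with probability d1_weight.

    Hypothetical helper: a fixed seed keeps the mix reproducible across runs.
    """
    if not 0.0 <= d1_weight <= 1.0:
        raise ValueError("d1_weight must be in [0, 1]")
    rng = random.Random(seed)  # private RNG so global random state is untouched
    for _ in range(n_samples):
        if rng.random() < d1_weight:
            yield "D1", rng.choice(d1)
        else:
            yield "D0", rng.choice(d0)
```

Setting `d1_weight=0.0` reproduces the D0-only baseline through the same code path, which may help keep the two conditions comparable.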
Runs (example IDs):
- EXP-0002-D0-only
- EXP-0002-D0-D1-mix
Measurements (Pass/Fail)
Primary:
- Late slope improvement:
  - Compare the variant's late slope (the 70–90% window) against the baseline's at equal tokens_seen.
- Plateau coefficient:
  - The variant should have a lower plateau coefficient (i.e., less flattening) than D0-only at equal budget.
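One possible way to compute the late slope is sketched below. It assumes the 70–90% window refers to a fraction of the final `tokens_seen` value and uses an ordinary least-squares fit of loss against tokens; both choices, and the name `late_slope`, are assumptions for this example. The plateau coefficient is not defined in this document, so the sketch does not attempt it.

```python
from typing import Sequence


def late_slope(
    tokens_seen: Sequence[float],
    loss: Sequence[float],
    lo: float = 0.7,
    hi: float = 0.9,
) -> float:
    """Least-squares slope of loss vs. tokens_seen inside the [lo, hi] window.

    Assumes tokens_seen is increasing and its last entry is the full budget.
    """
    if len(tokens_seen) != len(loss):
        raise ValueError("tokens_seen and loss must have the same length")
    total = tokens_seen[-1]
    pts = [(t, l) for t, l in zip(tokens_seen, loss) if lo * total <= t <= hi * total]
    if len(pts) < 2:
        raise ValueError("need at least two points in the late window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in pts)
    if var_t == 0:
        raise ValueError("window points must span more than one tokens_seen value")
    cov = sum((t - mean_t) * (l - mean_l) for t, l in pts)
    return cov / var_t
```

Under this reading, a more negative late slope for the variant than for the baseline, measured over the same window, would count as faster late-stage improvement.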
Secondary:
- Eval robustness:
  - Lower repetition rates on “play probes” prompts without loss of intelligibility.
- Qualitative behavior:
  - More consistent tense/POV in the “memory + consistency” prompts.
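The repetition rate is not defined here, so the following is only one plausible proxy: the share of whitespace-token n-grams in a sample that duplicate an earlier n-gram. The function name `repetition_rate`, the default `n=3`, and whitespace tokenization are all assumptions for this sketch.

```python
def repetition_rate(text: str, n: int = 3) -> float:
    """Fraction of n-grams that repeat an earlier one (0.0 means no repeats).

    Hypothetical metric: 1 - (distinct n-grams / total n-grams).
    """
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n tokens: nothing to repeat
    return 1.0 - len(set(ngrams)) / len(ngrams)
```

For example, `repetition_rate("a b c a b c")` has four trigrams of which three are distinct, giving 0.25. A score like this says nothing about intelligibility, which would still need a separate check.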
Confounders:
- D1 dataset quality/cleanliness can dominate results; keep D1 small at first (“baby steps”).
Results
Runs executed:
- (fill)
Observed:
- (fill)
Interpretation
- (fill)
Decision
- Adopt / Reject / Iterate
- Next actions:
Runs
| run_id | loss_best | plateau | tokens_seen | prompt_set |
|---|---|---|---|---|
| No runs linked to this experiment. | | | | |