EXP-0003 — “Play” via Controlled Novelty (Small Perturbations)
experiment_id: EXP-0003 · status: draft
Created: 2025-12-25 · Last Updated: 2025-12-25
Hypothesis
If we introduce a small, controlled novelty pressure (e.g., lightly perturbed text during sampling, or prompt mutation during evaluation), then the organism will become more robust to minor corruption without destabilizing training, because the model learns invariances and broader pattern tolerance.
Setup / Test Plan
What stays fixed:
- Dataset mix from the chosen week’s baseline (D0 or D0+D1).
- Training budget.
- Prompt suite base version: organism/prompts/v1.json.
What changes:
- Novelty mechanism (one at a time) at a small rate (e.g., 1–3%):
  - Option A (eval-only): prompt corruption (missing punctuation, reordered sentences).
  - Option B (training): small perturbations applied to sampled text (requires code support).
Planned runs:
- Control: no novelty.
- Variant: novelty enabled at fixed rate.
Measurements (Pass/Fail)
Primary:
- Eval robustness on “play probes”: higher intelligibility rate and/or lower repetition rate under corrupted prompts vs. control.
Secondary:
- Training stability: no significant increase in late-loss variance vs. control.
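The two metrics above can be sketched with stdlib tools; the n-gram definition of repetition and the 20% tail window are illustrative assumptions, not fixed by this plan:

```python
import statistics
from collections import Counter


def repetition_rate(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that are repeats of an earlier n-gram.

    Higher values indicate degenerate looping in the sampled text.
    """
    tokens = text.split()
    if len(tokens) < n:
        return 0.0
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    repeats = sum(c - 1 for c in Counter(ngrams).values())
    return repeats / len(ngrams)


def late_loss_variance(losses: list[float], tail_frac: float = 0.2) -> float:
    """Population variance over the final `tail_frac` of the loss curve."""
    tail = losses[int(len(losses) * (1 - tail_frac)):]
    return statistics.pvariance(tail) if len(tail) > 1 else 0.0
```

Comparing `late_loss_variance` between control and variant runs gives the pass/fail signal for the stability criterion.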
Notes:
- If the novelty is eval-only, training curves should match the control exactly; any robustness changes are purely behavioral.
Results
Runs executed:
- (fill)
Observed:
- (fill)
Interpretation
- (fill)
Decision
- Adopt / Reject / Iterate
- Next actions:
Runs
| run_id | loss_best | plateau | tokens_seen | prompt_set |
|---|---|---|---|---|
| _No runs linked to this experiment._ | | | | |