EXP-0004 — Guided Choice / Preference Signal (Adaptive Sampling)

experiment_id: EXP-0004 · status: draft

Status: Draft

Created: 2025-12-25 · Last Updated: 2025-12-25

Hypothesis

If we allow the organism to bias its dataset sampling with a simple adaptive rule, then training and behavior will diverge measurably from the fixed-weight control, because the organism’s learning dynamics will express a stable preference signal, measured by per-dataset loss improvement and by drift in outputs.

Setup / Test Plan

What stays fixed:

  • Dataset pool (D0, D1, optionally D2 when ready).
  • Total training budget.
  • Prompt suite: organism/prompts/v1.json (prompt_set_id: month1_v1).

What changes:

  • Sampling policy:
    • Control: fixed weights.
    • Variant: adaptive weights using one explicit rule (choose only one):
      • Upsample dataset with best recent loss delta, or
      • Target a mid-entropy / non-collapse band (requires entropy proxy support).
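The first variant rule (upsample the dataset with the best recent loss delta) could be sketched roughly as follows. This is a minimal illustration, not the experiment's actual implementation: the function name adaptive_weights, the softmax form, and the window, temperature, and floor parameters are all assumptions introduced here. The floor keeps every dataset sampled at a minimum rate so that no dataset is starved.

```python
import math


def adaptive_weights(loss_history, window=3, temperature=1.0, floor=0.05):
    """Return sampling weights that favor datasets with the largest recent loss drop.

    loss_history: dict mapping dataset name -> list of loss values, oldest first.
    All parameter names and defaults are hypothetical, chosen for illustration.
    """
    k = len(loss_history)
    if k == 0 or floor * k >= 1.0:
        raise ValueError("need at least one dataset and floor * n_datasets < 1")

    deltas = {}
    for name, losses in loss_history.items():
        if len(losses) > window:
            # Positive delta means the loss went down over the last `window` steps.
            deltas[name] = losses[-window - 1] - losses[-1]
        else:
            # Not enough history yet: treat the dataset as neutral.
            deltas[name] = 0.0

    # Softmax over deltas (shifted by the max for numerical stability).
    peak = max(deltas.values())
    exps = {n: math.exp((d - peak) / temperature) for n, d in deltas.items()}
    total = sum(exps.values())

    # Mix with a uniform floor so every dataset keeps a minimum share.
    return {n: floor + (1.0 - floor * k) * (v / total) for n, v in exps.items()}


if __name__ == "__main__":
    history = {
        "D0": [2.0, 1.9, 1.8, 1.7],  # slow improvement
        "D1": [2.0, 1.5, 1.0, 0.5],  # fast improvement
    }
    print(adaptive_weights(history))
```

The control arm corresponds to skipping this update and keeping the initial weights fixed for the whole training budget.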

Measurements (Pass/Fail)

Primary:

  • Dataset learning curves:
    • Compare per-dataset dataset_loss_ma improvement rate and resulting weights over time.
  • Behavioral drift:
    • Tag changes in style/content in eval prompts.
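One way to compute the per-dataset improvement rate is sketched below. It assumes that dataset_loss_ma is a trailing moving average of the per-dataset loss and that the improvement rate is the average loss decrease per token seen; neither definition is confirmed by this record, and the helper names are hypothetical.

```python
def moving_average(values, window=3):
    """Trailing moving average; early points average over the values available so far."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


def improvement_rate(loss_ma, tokens_seen):
    """Average decrease in smoothed loss per token between the first and last point.

    Assumed definition: (first loss - last loss) / (tokens elapsed).
    """
    if len(loss_ma) != len(tokens_seen) or len(loss_ma) < 2:
        raise ValueError("need two or more aligned points")
    span = tokens_seen[-1] - tokens_seen[0]
    if span <= 0:
        raise ValueError("tokens_seen must increase")
    return (loss_ma[0] - loss_ma[-1]) / span


if __name__ == "__main__":
    raw = [3.0, 2.0, 1.0, 0.0]
    tokens = [0, 100, 200, 300]
    smoothed = moving_average(raw, window=2)
    print(smoothed, improvement_rate(smoothed, tokens))
```

Running the same computation on the control and variant arms, per dataset, gives the curves to compare alongside the sampling weights over time.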

Secondary:

  • Overall late-slope and plateau coefficient vs control.
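This record does not define late-slope or plateau coefficient, so the sketch below uses placeholder definitions that should be replaced with the project's own: late-slope is taken as the least-squares slope of loss over the final fraction of the run, and plateau coefficient as the ratio of that late slope to the slope over the first fraction of equal size. All names and the tail_fraction parameter are assumptions.

```python
def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more aligned points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def late_slope(steps, losses, tail_fraction=0.25):
    """Slope of the loss curve over the last `tail_fraction` of points (assumed definition)."""
    k = max(2, int(len(steps) * tail_fraction))
    return ls_slope(steps[-k:], losses[-k:])


def plateau_coefficient(steps, losses, tail_fraction=0.25):
    """Late slope divided by early slope (assumed definition).

    Values near 0 suggest the curve has flattened; values near 1 suggest
    the run is still improving at its initial rate.
    """
    k = max(2, int(len(steps) * tail_fraction))
    early = ls_slope(steps[:k], losses[:k])
    if early == 0:
        raise ValueError("early slope is zero; ratio is undefined")
    return late_slope(steps, losses, tail_fraction) / early


if __name__ == "__main__":
    steps = list(range(8))
    losses = [8.0, 6.0, 4.0, 3.0, 2.5, 2.2, 2.1, 2.0]
    print(late_slope(steps, losses), plateau_coefficient(steps, losses))
```

Comparing these two numbers between the adaptive variant and the fixed-weight control is one way to read the secondary measurement.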

Results

Runs executed:

  • (fill)

Observed:

  • (fill)

Interpretation

  • (fill)

Decision

  • Adopt / Reject / Iterate
  • Next actions:

run_id | loss_best | plateau | tokens_seen | prompt_set
No runs linked to this experiment.