Settings reference
Every setting you can pass to a run, what it does, its default, and how to choose it. Settings go in the hyperparams object of POST /jobs.
Training settings
For type: "train" runs.
turns — the one that matters
- Default:
1· Typical: 1–3 - The primary lever on quality, time, and cost. A turn is a complete training pass; each additional turn continues from the last and pushes quality higher.
- Choose:
1fast & good ·2recommended (the biggest quality-per-dollar jump — in testing perplexity dropped ~17 → ~11) ·3best, diminishing returns. - Each turn scales the training portion of the bill linearly; the one-time growth fee is unchanged.
- These defaults are tuned starting points, not magic numbers — see Finding the right config for the cheap iterate-to-tune workflow.
output_name
- Default:
model-<id> - The name your new model gets in the catalog. Use something you'll recognize, e.g.
support-bot-v2.
Advanced tuning (optional)
Standard knobs for power users who want to tune. All have working defaults — the Basic form (turns + model + dataset) alone produces a great run.
cycles(default6) ·steps_per_cycle(default1000) — how much training happens within each turn. Total steps =turns × cycles × steps_per_cycle. Keepsteps_per_cyclenear1000(below ~500 under-trains).lr(learning rate, default1e-4) — the training learning rate.seq_len(default256) — max sequence length per example.
The cost equation
training tokens = turns × cycles × steps_per_cycle × sequence_length. Token price is tiered by model size; a separate growth fee covers the capacity added. Always estimate before a large run.
Contraction settings
For type: "contract" runs.
source_model_id
- Required. The trained model to shrink. The source is never modified; you get a new, smaller model.
contraction_ratio
- Default:
0.5· Range: 0.1–0.9 - How aggressively to prune within each targeted layer. Higher = smaller model, more quality risk.
- Choose:
0.25conservative ·0.5balanced ·0.75aggressive (verify quality).
num_layers_to_contract
- Default:
8· Range: 1–(model depth) - How many layers to process. More layers → more total shrinkage.
Advanced
train_steps
- Overrides
cycles × steps_per_cyclewith an explicit step count per turn. Most users should useturns— it's clearer and maps directly to cost.
How settings interact with cost & time
| If you increase… | Quality | Cost | Time |
|---|---|---|---|
turns | ↑↑ | ↑↑ (training only) | ↑↑ |
cycles / steps_per_cycle (per turn) | ↑ | ↑ | ↑ |
| larger base model | ↑ (capability) | ↑↑ (higher token tier + more growth) | ↑ |
contraction_ratio | ↓ (slight) | flat | flat |
Defaults at a glance
json
// training — turns is the lever; the rest are advanced
{ "turns": 1, "cycles": 6, "steps_per_cycle": 1000 }
// contraction
{ "contraction_ratio": 0.5, "num_layers_to_contract": 8 }These defaults produce a strong, general-purpose result for most models — start here, then tune with the guidance above.