Settings reference

This page lists every setting you can pass to a run: what it does, its default, and how to choose a value. Settings go in the `hyperparams` object of `POST /jobs`.

Training settings

For type: "train" runs.

`turns` — the one that matters

Default: 1 · Typical: 1–3
The primary lever on quality, time, and cost. A turn is a complete training pass; each additional turn continues from the last and pushes quality higher.
Choose: 1 fast & good · 2 recommended (the biggest quality-per-dollar jump; in testing, perplexity dropped from ~17 to ~11) · 3 best, with diminishing returns.
The training portion of the bill scales linearly with `turns`; the one-time growth fee is unchanged.
These defaults are tuned starting points, not magic numbers. See Finding the right config for the cheap iterate-to-tune workflow.

`output_name`

Default: model-<id>
The name your new model gets in the catalog. Use something you'll recognize, e.g. support-bot-v2.
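
As a rough sketch, the two basic settings might be assembled into a request body like this. The source says only that settings go in the `hyperparams` object of `POST /jobs` and that training runs use `type: "train"`; the helper name `build_train_job` is hypothetical, placing `type` at the top level of the body is an assumption, and the fields that select the model and dataset are left out because their names are not given on this page.

```python
import json
from typing import Optional


def build_train_job(turns: int = 1, output_name: Optional[str] = None) -> dict:
    """Build a request body for a training run (hypothetical helper)."""
    # Assumption: turns is a positive integer.
    if turns < 1:
        raise ValueError("turns must be at least 1")
    hyperparams = {"turns": turns}
    # Leave output_name out to accept the default name (model-<id>).
    if output_name is not None:
        hyperparams["output_name"] = output_name
    # Assumption: "type" sits next to "hyperparams" at the top level of the body.
    return {"type": "train", "hyperparams": hyperparams}


body = build_train_job(turns=2, output_name="support-bot-v2")
print(json.dumps(body, indent=2))
```

Only the settings you pass are included, so anything you omit falls back to its default.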

Advanced tuning (optional)

Standard knobs for power users. All have working defaults; the Basic form (`turns` + model + dataset) alone produces a great run.

`cycles` (default 6) · `steps_per_cycle` (default 1000): how much training happens within each turn. Total steps = `turns` × `cycles` × `steps_per_cycle`. Keep `steps_per_cycle` near 1000; below ~500 the model under-trains.
`lr` (default 1e-4): the training learning rate.
`seq_len` (default 256): maximum sequence length per example.
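
The total-steps formula above can be checked with a few lines of Python. This is only an illustration of the arithmetic; the function name `total_steps` is made up for the example, and the defaults are the ones listed on this page.

```python
def total_steps(turns: int = 1, cycles: int = 6, steps_per_cycle: int = 1000) -> int:
    """Total steps = turns x cycles x steps_per_cycle, using the documented defaults."""
    return turns * cycles * steps_per_cycle


print(total_steps())         # all defaults: 1 * 6 * 1000
print(total_steps(turns=2))  # a second turn doubles the step count
```

With every default in place a run comes to 6,000 steps, and two turns come to 12,000.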

The cost equation

Training tokens = `turns` × `cycles` × `steps_per_cycle` × `seq_len`. Token price is tiered by model size; a separate growth fee covers the capacity added. Always estimate before a large run.
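
To make the cost equation concrete, here is a minimal sketch of the token count and of a bill that follows the structure described above (a token charge plus a one-time growth fee). The function names are hypothetical, and `price_per_token` and `growth_fee` are placeholders you would fill in from Pricing or the Estimate API; no real rates are given on this page.

```python
def training_tokens(turns: int = 1, cycles: int = 6,
                    steps_per_cycle: int = 1000, seq_len: int = 256) -> int:
    """Training tokens per the cost equation, using the documented defaults."""
    return turns * cycles * steps_per_cycle * seq_len


def estimated_bill(tokens: int, price_per_token: float, growth_fee: float) -> float:
    """Hypothetical bill: the token charge plus the one-time growth fee.

    price_per_token and growth_fee are placeholders supplied by the caller;
    real rates depend on the model-size tier and are not listed on this page.
    """
    return tokens * price_per_token + growth_fee


# Token count grows linearly with turns; the growth fee would not change.
for turns in (1, 2, 3):
    print(turns, training_tokens(turns=turns))
```

With the defaults, one turn comes to 1,536,000 training tokens, and each extra turn adds the same amount again.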

Contraction settings

For type: "contract" runs.

`source_model_id`

Required. The trained model to shrink. The source is never modified; you get a new, smaller model.

`contraction_ratio`

Default: 0.5 · Range: 0.1–0.9
How aggressively to prune within each targeted layer. Higher = smaller model, more quality risk.
Choose: 0.25 conservative · 0.5 balanced · 0.75 aggressive (verify quality).

`num_layers_to_contract`

Default: 8 · Range: 1–(model depth)
How many layers to process. More layers → more total shrinkage.
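
A client-side check of the documented ranges might look like the sketch below. The helper `validate_contraction` is hypothetical, `model_depth` is a value the caller has to know for the source model, and `"my-model"` is a placeholder ID; whether the service performs the same checks is not stated here.

```python
def validate_contraction(source_model_id: str, model_depth: int,
                         contraction_ratio: float = 0.5,
                         num_layers_to_contract: int = 8) -> dict:
    """Check contraction settings against the documented ranges (hypothetical helper)."""
    if not source_model_id:
        raise ValueError("source_model_id is required")
    # Documented range for contraction_ratio: 0.1 to 0.9.
    if not 0.1 <= contraction_ratio <= 0.9:
        raise ValueError("contraction_ratio must be between 0.1 and 0.9")
    # Documented range for num_layers_to_contract: 1 to the model depth.
    if not 1 <= num_layers_to_contract <= model_depth:
        raise ValueError("num_layers_to_contract must be between 1 and model_depth")
    return {
        "source_model_id": source_model_id,
        "contraction_ratio": contraction_ratio,
        "num_layers_to_contract": num_layers_to_contract,
    }


print(validate_contraction("my-model", model_depth=24, contraction_ratio=0.25))
```

Failing early on an out-of-range value is cheaper than submitting a run that cannot start.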

Advanced

`train_steps`

Overrides `cycles` × `steps_per_cycle` with an explicit step count per turn. Most users should adjust `turns` instead: it is clearer and maps directly to cost.
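
The override rule can be pictured as follows. This is a sketch of the behaviour as described, not the service's implementation, and `steps_per_turn` is a name invented for the example.

```python
from typing import Optional


def steps_per_turn(cycles: int = 6, steps_per_cycle: int = 1000,
                   train_steps: Optional[int] = None) -> int:
    """Steps in one turn: train_steps wins when set, else cycles x steps_per_cycle."""
    if train_steps is not None:
        return train_steps
    return cycles * steps_per_cycle


print(steps_per_turn())                  # no override: 6 * 1000
print(steps_per_turn(train_steps=2500))  # explicit count replaces the product
```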

How settings interact with cost & time

| If you increase… | Quality | Cost | Time |
| --- | --- | --- | --- |
| `turns` | ↑↑ | ↑↑ (training only) | ↑↑ |
| `cycles` / `steps_per_cycle` (per turn) | ↑ | ↑ | ↑ |
| larger base model | ↑ (capability) | ↑↑ (higher token tier + more growth) | ↑ |
| `contraction_ratio` | ↓ (slight) | flat | flat |

Defaults at a glance

```json
// training: turns is the lever; the rest are advanced
{ "turns": 1, "cycles": 6, "steps_per_cycle": 1000 }

// contraction
{ "contraction_ratio": 0.5, "num_layers_to_contract": 8 }
```

These defaults produce a strong, general-purpose result for most models. Start here, then tune with the guidance above.

Next

Train a model · Contract a model
Pricing · Estimate API
