Skip to content

Settings reference

Every setting you can pass to a run, what it does, its default, and how to choose it. Settings go in the hyperparams object of POST /jobs.

Training settings

For type: "train" runs.

turns — the one that matters

  • Default: 1 · Typical: 1–3
  • The primary lever on quality, time, and cost. A turn is a complete training pass; each additional turn continues from the last and pushes quality higher.
  • Choose: 1 fast & good · 2 recommended (the biggest quality-per-dollar jump — in testing perplexity dropped ~17 → ~11) · 3 best, diminishing returns.
  • Each turn scales the training portion of the bill linearly; the one-time growth fee is unchanged.
  • These defaults are tuned starting points, not magic numbers — see Finding the right config for the cheap iterate-to-tune workflow.

output_name

  • Default: model-<id>
  • The name your new model gets in the catalog. Use something you'll recognize, e.g. support-bot-v2.

Advanced tuning (optional)

Standard knobs for power users who want to tune. All have working defaults — the Basic form (turns + model + dataset) alone produces a great run.

  • cycles (default 6) · steps_per_cycle (default 1000) — how much training happens within each turn. Total steps = turns × cycles × steps_per_cycle. Keep steps_per_cycle near 1000 (below ~500 under-trains).
  • lr (learning rate, default 1e-4) — the training learning rate.
  • seq_len (default 256) — max sequence length per example.

The cost equation

training tokens = turns × cycles × steps_per_cycle × sequence_length. Token price is tiered by model size; a separate growth fee covers the capacity added. Always estimate before a large run.

Contraction settings

For type: "contract" runs.

source_model_id

  • Required. The trained model to shrink. The source is never modified; you get a new, smaller model.

contraction_ratio

  • Default: 0.5 · Range: 0.1–0.9
  • How aggressively to prune within each targeted layer. Higher = smaller model, more quality risk.
  • Choose: 0.25 conservative · 0.5 balanced · 0.75 aggressive (verify quality).

num_layers_to_contract

  • Default: 8 · Range: 1–(model depth)
  • How many layers to process. More layers → more total shrinkage.

Advanced

train_steps

  • Overrides cycles × steps_per_cycle with an explicit step count per turn. Most users should use turns — it's clearer and maps directly to cost.

How settings interact with cost & time

If you increase…QualityCostTime
turns↑↑↑↑ (training only)↑↑
cycles / steps_per_cycle (per turn)
larger base model↑ (capability)↑↑ (higher token tier + more growth)
contraction_ratio↓ (slight)flatflat

Defaults at a glance

json
// training — turns is the lever; the rest are advanced
{ "turns": 1, "cycles": 6, "steps_per_cycle": 1000 }

// contraction
{ "contraction_ratio": 0.5, "num_layers_to_contract": 8 }

These defaults produce a strong, general-purpose result for most models — start here, then tune with the guidance above.

Next

Fusion Training Console