How Fusion works
Fusion Training grows a model instead of fine-tuning or retraining it. This page covers the idea at a high level — what happens to your model and what you get back.
The core idea
A trained model has knowledge baked into its weights. Fine-tuning risks overwriting that knowledge; training from scratch throws it away. Fusion does neither — it adds new capacity to the model in a way that preserves everything it already knew, then trains only that new capacity on your data.
The result is a larger, more capable model that still knows everything the original did — produced in minutes, not days.
The lifecycle of a run
- Diagnose: your model is analyzed to find where it is capacity-constrained and would benefit most from new neurons.
- Grow: new capacity is added through a function-preserving expansion, so the larger model behaves identically to the original until it is trained. Nothing is lost in the growth.
- Train: the new capacity learns from your dataset while the model's existing knowledge is held intact. Runs are checkpointed and resume automatically if a worker is interrupted.
- Contract (optional): the inverse of Grow. Low-value neurons are identified and pruned, shrinking the model by 10–25% with minimal quality loss.
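The Contract step above can be sketched on a toy one-hidden-layer network. This is an illustration under stated assumptions, not Fusion's implementation: this page does not say how low-value neurons are identified, so the sketch assumes the L2 norm of each neuron's outgoing weights as the score. The dictionary layout (W1, b1, W2, b2) and the function names are hypothetical.

```python
import math


def neuron_scores(model):
    """Score each hidden neuron by the L2 norm of its outgoing weights.

    ASSUMPTION: this magnitude-based score is a stand-in; the real
    criterion for "low-value" neurons is not described on this page.
    """
    n_hidden = len(model["W1"])
    return [math.sqrt(sum(row[j] ** 2 for row in model["W2"]))
            for j in range(n_hidden)]


def contract(model, fraction):
    """Return a copy of the model with the lowest-scoring neurons removed."""
    scores = neuron_scores(model)
    n_prune = int(len(scores) * fraction)
    lowest_first = sorted(range(len(scores)), key=lambda j: scores[j])
    drop = set(lowest_first[:n_prune])
    keep = [j for j in range(len(scores)) if j not in drop]
    return {
        "W1": [model["W1"][j][:] for j in keep],                # incoming weights
        "b1": [model["b1"][j] for j in keep],                   # hidden biases
        "W2": [[row[j] for j in keep] for row in model["W2"]],  # outgoing weights
        "b2": model["b2"][:],
    }


# Toy model: 2 inputs, 8 hidden neurons, 1 output (hypothetical values).
model = {
    "W1": [[0.1 * (j + 1), -0.05 * (j + 1)] for j in range(8)],
    "b1": [0.0] * 8,
    "W2": [[0.9, 0.01, -0.7, 0.5, -0.02, 0.8, 0.6, -0.4]],
    "b2": [0.0],
}

smaller = contract(model, fraction=0.25)  # removes the 2 weakest neurons
```

Pruning changes the model's outputs slightly, which is why the page describes the quality loss as minimal rather than zero.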
Function preservation
Before any training, the grown model produces the same outputs as your original model. Every run reports a function_preservation figure, so you can confirm that the growth was clean before training began.
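A toy example may make the property concrete. The sketch below widens a one-hidden-layer network by giving the new neurons zero outgoing weights, which is one well-known way to get a function-preserving expansion. Whether Fusion uses this construction is an assumption. It then takes a few gradient steps that touch only the new parameters. The maximum output difference computed here is a stand-in for the reported function_preservation figure, whose exact definition is not given on this page. All function names and the model layout are hypothetical.

```python
import random


def forward(model, x):
    """Return (hidden activations, outputs) of a one-hidden-layer ReLU network."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(model["W1"], model["b1"])]
    out = [sum(w * h for w, h in zip(row, hidden)) + b
           for row, b in zip(model["W2"], model["b2"])]
    return hidden, out


def grow(model, n_new, rng):
    """Add n_new hidden neurons whose outgoing weights start at zero.

    ASSUMPTION: zero-initialized outgoing weights are used here because
    they leave every output unchanged; the real method may differ.
    """
    d_in = len(model["W1"][0])
    new_rows = [[rng.uniform(-0.5, 0.5) for _ in range(d_in)]
                for _ in range(n_new)]
    return {
        "W1": [row[:] for row in model["W1"]] + new_rows,
        "b1": model["b1"][:] + [0.0] * n_new,
        "W2": [row[:] + [0.0] * n_new for row in model["W2"]],
        "b2": model["b2"][:],
    }


def train_step_new_only(model, x, target, n_old, lr=0.1):
    """One squared-error gradient step that updates only neurons >= n_old."""
    hidden, out = forward(model, x)
    err = [o - t for o, t in zip(out, target)]
    for j in range(n_old, len(hidden)):
        # Gradient reaching hidden neuron j, using weights before the update.
        grad_h = sum(e * row[j] for e, row in zip(err, model["W2"]))
        for o, e in enumerate(err):
            model["W2"][o][j] -= lr * e * hidden[j]
        if hidden[j] > 0.0:  # ReLU passes gradient only when active
            for i, xi in enumerate(x):
                model["W1"][j][i] -= lr * grad_h * xi
            model["b1"][j] -= lr * grad_h


original = {"W1": [[0.5, -0.2], [0.1, 0.8]], "b1": [0.0, 0.1],
            "W2": [[1.0, -1.0]], "b2": [0.2]}
grown = grow(original, n_new=2, rng=random.Random(0))

# Preservation check: compare outputs on a few probe inputs before training.
probes = [[1.0, 2.0], [-0.5, 0.3], [0.0, 0.0]]
max_diff = max(abs(a - b)
               for x in probes
               for a, b in zip(forward(original, x)[1], forward(grown, x)[1]))
# max_diff is exactly 0.0: the new neurons contribute nothing yet.

for _ in range(3):
    train_step_new_only(grown, [1.0, 2.0], [1.0], n_old=2)
# The first two rows of grown["W1"] still equal original["W1"].
```

Because the original weights are never written to, everything the original model computed remains available; only the added neurons change the output.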
What you get back
Every run produces a new model in your account — your original is never modified. Each run reports its results: perplexity before and after, parameter counts, and the compute it used. See Monitor a run and GET /jobs/{id}/result.
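As a rough sketch of how the reported figures might be used once retrieved, the snippet below builds the result path named above and derives two summary ratios. The field names (perplexity_before, perplexity_after, params_before, params_after) are hypothetical, since the response schema is not shown on this page, and the sample values are placeholders rather than real measurements.

```python
def result_path(job_id):
    """Build the path of the result endpoint for a given job id."""
    return f"/jobs/{job_id}/result"


def summarize(result):
    """Derive relative changes from a result payload.

    ASSUMPTION: the key names below are hypothetical; check the actual
    response of GET /jobs/{id}/result for the real field names.
    """
    ppl_before = result["perplexity_before"]
    ppl_after = result["perplexity_after"]
    params_before = result["params_before"]
    params_after = result["params_after"]
    return {
        # Negative means perplexity went down (the model improved).
        "perplexity_change": (ppl_after - ppl_before) / ppl_before,
        # Fraction by which the parameter count grew.
        "param_growth": (params_after - params_before) / params_before,
    }


# Placeholder payload for illustration only; not real results.
example = {
    "perplexity_before": 20.0,
    "perplexity_after": 15.0,
    "params_before": 1000,
    "params_after": 1200,
}

path = result_path("job-123")  # "job-123" is a made-up id
summary = summarize(example)
```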
Next: Launch a run · Pricing