How Fusion works

Fusion Training grows a model instead of fine-tuning or retraining it. This page covers the idea at a high level — what happens to your model and what you get back.

The core idea

A trained model has knowledge baked into its weights. Fine-tuning risks overwriting that knowledge; training from scratch throws it away. Fusion does neither — it adds new capacity to the model in a way that preserves everything it already knew, then trains only that new capacity on your data.

The result is a larger, more capable model that still knows everything the original did — produced in minutes, not days.

The lifecycle of a run

1. Diagnose: your model is analyzed to find where it is capacity-constrained and would benefit most from new neurons.
2. Grow: new capacity is added through a function-preserving expansion, so the larger model behaves identically to the original until it is trained. Nothing is lost in the growth.
3. Train: the new capacity learns from your dataset while the model's existing knowledge is held intact. Runs are checkpointed and resume automatically if a worker is interrupted.
4. Contract (optional): the inverse operation. Low-value neurons are identified and pruned, shrinking a model by 10–25% with minimal quality loss.
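
The Train step can be pictured with a toy model. The sketch below is not Fusion's implementation, which this page does not describe. It assumes one simple way of holding existing knowledge intact: freeze the original weights and apply gradient updates only to the newly added ones. The linear model, the `trainable` mask, and all values are hypothetical.

```python
import random

def predict(weights, x):
    """Linear model: y = sum(w_i * x_i)."""
    return sum(w * xi for w, xi in zip(weights, x))

def mean_squared_error(weights, data):
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def sgd_step(weights, trainable, x, y, lr):
    """One squared-error SGD step that leaves frozen weights untouched."""
    err = predict(weights, x) - y
    return [w - lr * 2.0 * err * xi if is_new else w
            for w, xi, is_new in zip(weights, x, trainable)]

rng = random.Random(0)

# Hypothetical setup: two "existing" weights plus two newly added ones.
# New weights start at zero, so the grown model initially matches the original.
existing = [0.5, -1.0]
weights = existing + [0.0, 0.0]
trainable = [False, False, True, True]

# Synthetic data that can only be fit by using the new capacity.
true_weights = [0.5, -1.0, 2.0, 0.3]
data = []
for _ in range(50):
    x = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    data.append((x, predict(true_weights, x)))

loss_before = mean_squared_error(weights, data)
for _ in range(20):  # epochs
    for x, y in data:
        weights = sgd_step(weights, trainable, x, y, lr=0.1)
loss_after = mean_squared_error(weights, data)

print(f"loss before: {loss_before:.4f}, after: {loss_after:.6f}")
print("existing weights unchanged:", weights[:2] == existing)
```

Because the mask skips the first two entries, the original weights come out of training bit-for-bit identical, while the loss falls as the new weights fit the data.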

Function preservation

The grown-but-untrained model is functionally your original model: given the same inputs, it produces the same outputs. We report a function_preservation figure on every run so you can confirm the growth was clean before any training happened.
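
To make the same-outputs property concrete, here is a minimal sketch of one well-known way to widen a network without changing its function: give each new hidden neuron zero outgoing weights. This is an assumption for illustration only. The page does not say which expansion Fusion uses or how function_preservation is computed, and the helper names (`grow_hidden`, `max_output_gap`) and the max-absolute-difference check are hypothetical.

```python
import math
import random

def forward(x, W1, b1, W2, b2):
    """Two-layer MLP: hidden = tanh(W1 x + b1), output = W2 hidden + b2."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]

def grow_hidden(W1, b1, W2, n_new, rng):
    """Add n_new hidden neurons without changing the network's outputs.

    New neurons get random incoming weights so they can learn later, and
    zero outgoing weights so they contribute nothing to the output yet.
    """
    n_in = len(W1[0])
    new_W1 = [row[:] for row in W1] + [
        [rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_new)]
    new_b1 = b1[:] + [0.0] * n_new
    new_W2 = [row[:] + [0.0] * n_new for row in W2]
    return new_W1, new_b1, new_W2

def max_output_gap(model_a, model_b, probes):
    """Largest absolute output difference between two models over probe inputs."""
    return max(abs(a - b)
               for x in probes
               for a, b in zip(forward(x, *model_a), forward(x, *model_b)))

rng = random.Random(0)

# A small "original" model: 3 inputs, 4 hidden neurons, 2 outputs.
W1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
b1 = [rng.uniform(-1, 1) for _ in range(4)]
W2 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
b2 = [rng.uniform(-1, 1) for _ in range(2)]

# Grow the hidden layer from 4 to 7 neurons, then compare outputs.
gW1, gb1, gW2 = grow_hidden(W1, b1, W2, n_new=3, rng=rng)
probes = [[rng.uniform(-2, 2) for _ in range(3)] for _ in range(20)]
gap = max_output_gap((W1, b1, W2, b2), (gW1, gb1, gW2, b2), probes)

print("hidden neurons:", len(W1), "->", len(gW1))
print("max output gap:", gap)
```

The gap stays at or near zero because every new neuron's contribution is multiplied by a zero weight. Once training moves those outgoing weights away from zero, the new neurons start to affect the output.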

What you get back

Every run produces a new model in your account — your original is never modified. Each run reports its results: perplexity before and after, parameter counts, and the compute it used. See Monitor a run and GET /jobs/{id}/result.
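
As a rough sketch, a result could be retrieved from the GET /jobs/{id}/result endpoint as shown below. Only the endpoint path comes from this page. The host (`api.example.com`), the bearer-token header, the job ID format, and the function names are placeholders assumed for illustration, so check the API reference for the real values.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical value: this page does not state the API host or auth scheme.
BASE_URL = "https://api.example.com"

def build_result_request(job_id, api_key, base_url=BASE_URL):
    """Build a GET request for /jobs/{id}/result without sending it."""
    path = f"/jobs/{urllib.parse.quote(job_id, safe='')}/result"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Accept": "application/json",
        },
        method="GET",
    )

def fetch_result(job_id, api_key, base_url=BASE_URL):
    """Send the request and decode the JSON body (performs a network call)."""
    request = build_result_request(job_id, api_key, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)

# Building the request is safe to run offline; fetch_result needs a real host.
request = build_result_request("job_123", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Keeping request construction separate from the network call makes the URL and headers easy to inspect before anything is sent.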

Next: Launch a run · Pricing
