Introduction
Fusion Training (internally GRAFT: Growth-guided Routing via Activation-driven Function-preserving Training) is a new category of model training. Instead of fine-tuning existing weights or retraining from scratch, Fusion grows a model: it adds new capacity, then trains only that capacity while preserving everything the model already knows.
For example, a 1.5B-parameter model grows to about 2.2B (+680M parameters) and improves dramatically in minutes, and the approach holds at any model size.
The two operations
| Operation | What it does |
|---|---|
| Train (Fusion) | Expand the model and train the new capacity. The model grows. |
| Contract (inverse GRAFT) | Scan for low-utilization neurons and prune them. The model shrinks while quality is preserved. |
Both run as jobs on ephemeral GPU workers, are checkpointed and preemption-safe, and produce a new model in your account.
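As a rough picture of what Contract does, the sketch below prunes hidden neurons from a toy two-layer network. It is an illustration only, not the GRAFT implementation: the utilization score (mean absolute contribution to the output over a sample batch), the `threshold` parameter, and all function names are assumptions made for this example.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def hidden_activations(w_in, x):
    """ReLU activations of each hidden neuron for one input vector."""
    return [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_in]

def utilization(w_in, w_out, samples):
    """Assumed score: mean |w_out[j] * h_j| of each hidden neuron over the samples."""
    totals = [0.0] * len(w_in)
    for x in samples:
        for j, h in enumerate(hidden_activations(w_in, x)):
            totals[j] += abs(w_out[j] * h)
    return [t / len(samples) for t in totals]

def contract(w_in, w_out, samples, threshold):
    """Drop hidden neurons whose utilization score falls below the threshold."""
    scores = utilization(w_in, w_out, samples)
    keep = [j for j, s in enumerate(scores) if s >= threshold]
    return [w_in[j] for j in keep], [w_out[j] for j in keep]

# Toy model: neuron 1 never fires on these samples, neuron 2 barely matters.
w_in = [[1.0, 0.0], [-1.0, -1.0], [0.0, 1.0]]
w_out = [2.0, 5.0, 0.001]
samples = [[1.0, 1.0], [2.0, 0.5]]
small_in, small_out = contract(w_in, w_out, samples, threshold=0.01)
```

In this toy case only the first neuron survives, and the pruned model's output on the sample inputs changes by at most the small contribution of the removed neurons.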
How it works
- Diagnose — find where the model is capacity-constrained.
- Grow — a function-preserving expansion: the grown model's outputs are mathematically identical to the original's until the new capacity is trained, so nothing is lost.
- Train — only the new capacity learns; existing knowledge is held intact.
- Contract (optional) — prune low-value neurons to shrink the result.
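The Grow and Train steps above can be illustrated on a toy two-layer network. This is a minimal sketch under stated assumptions, not the GRAFT algorithm: it uses one common way to get a function-preserving expansion (new hidden neurons whose outgoing weights start at zero), and it simplifies training to a single gradient step on the new neurons' outgoing weights only. All function names and the learning rate are hypothetical.

```python
import random

def relu(x):
    return x if x > 0.0 else 0.0

def forward(w_in, w_out, x):
    """Toy MLP with one hidden layer and a scalar output: y = w_out . relu(w_in x)."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    return sum(wo * h for wo, h in zip(w_out, hidden))

def grow(w_in, w_out, n_new, rng):
    """Append n_new hidden neurons with random incoming and zero outgoing weights.

    Zero outgoing weights mean the new neurons add nothing to the output yet,
    so the grown model computes the same function as the original.
    """
    n_inputs = len(w_in[0])
    new_rows = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_new)]
    return w_in + new_rows, w_out + [0.0] * n_new

def train_step_new_only(w_in, w_out, n_old, x, target, lr=0.01):
    """One SGD step on 0.5 * (y - target)^2 that updates only neurons with index >= n_old."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    err = sum(wo * h for wo, h in zip(w_out, hidden)) - target
    # Gradient w.r.t. w_out[j] is err * hidden[j]; original neurons are left frozen.
    return [wo if j < n_old else wo - lr * err * hidden[j] for j, wo in enumerate(w_out)]

rng = random.Random(0)
w_in, w_out = [[0.5, -0.2], [0.1, 0.8]], [1.0, -1.0]
x = [1.0, 2.0]
grown_in, grown_out = grow(w_in, w_out, n_new=3, rng=rng)
# Function preservation: the grown model matches the original before any training.
assert forward(grown_in, grown_out, x) == forward(w_in, w_out, x)
trained_out = train_step_new_only(grown_in, grown_out, n_old=len(w_out), x=x, target=0.0)
# The original neurons' weights are untouched by the update.
assert trained_out[:len(w_out)] == w_out
```

The zero-initialized outgoing weights are what make the expansion safe to apply: the new neurons only start to affect the output once training moves those weights away from zero.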
See How Fusion works for more.
Where to go next
- Quickstart — your first run in a few minutes.
- Authentication — create an API key.
- API Reference — every endpoint.