Introduction

Fusion Training (internally GRAFT: Growth-guided Routing via Activation-driven Function-preserving Training) is a new category of model training. Instead of fine-tuning existing weights or retraining from scratch, Fusion grows a model: it adds new capacity, then trains only that new capacity while preserving everything the model already knew.

For example, a 1.5B-parameter model grows to roughly 2.2B (+680M parameters) and improves dramatically within minutes of training. The approach holds at any model size.

The two operations

| Operation | What it does |
| --- | --- |
| Train (Fusion) | Expand the model and train the new capacity. The model grows. |
| Contract (inverse GRAFT) | Scan for low-utilization neurons and prune them. The model shrinks, quality held. |

Both run as jobs on ephemeral GPU workers, are checkpointed and preemption-safe, and produce a new model in your account.
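
To make the Contract operation more concrete, here is a minimal, dependency-free sketch of utilization-based pruning on a toy single-hidden-layer network. It is an illustration only, not Fusion's implementation: the neuron layout, the `utilization` metric (mean ReLU activation over a calibration set), the `threshold` value, and every function name are assumptions made for this example.

```python
def relu(z):
    return z if z > 0.0 else 0.0

def activation(neuron, x):
    """ReLU activation of one hidden neuron for input vector x."""
    pre = sum(w * xi for w, xi in zip(neuron["w_in"], x)) + neuron["b"]
    return relu(pre)

def forward(x, neurons, out_bias):
    """Toy network: one hidden ReLU layer and a scalar output."""
    return out_bias + sum(activation(n, x) * n["w_out"] for n in neurons)

def utilization(neurons, calibration):
    """Assumed metric: mean activation of each neuron over calibration inputs."""
    return [sum(activation(n, x) for x in calibration) / len(calibration)
            for n in neurons]

def contract(neurons, calibration, threshold=1e-3):
    """Keep only neurons whose utilization reaches the (hypothetical) threshold."""
    scores = utilization(neurons, calibration)
    return [n for n, s in zip(neurons, scores) if s >= threshold]

neurons = [
    {"w_in": [1.0, 0.0], "b": 0.0, "w_out": 1.0},
    {"w_in": [0.0, 1.0], "b": 0.0, "w_out": -2.0},
    {"w_in": [1.0, 1.0], "b": -5.0, "w_out": 3.0},  # never fires on this data
]
calibration = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]

pruned = contract(neurons, calibration)
before = [forward(x, neurons, 0.1) for x in calibration]
after = [forward(x, pruned, 0.1) for x in calibration]
print(len(neurons), "->", len(pruned), "neurons; outputs unchanged:", before == after)
```

In this toy case the third neuron never activates, so removing it leaves the outputs on the calibration inputs exactly as they were. A real model would have neurons with small but nonzero utilization, where pruning trades a little accuracy for size.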

How it works

  1. Diagnose — find where the model is capacity-constrained.
  2. Grow — a function-preserving expansion that's mathematically identical to the original until trained, so nothing is lost.
  3. Train — only the new capacity learns; existing knowledge is held intact.
  4. Contract (optional) — prune low-value neurons to shrink the result.
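
The Grow and Train steps above can be sketched on a toy network. The example below is a hedged illustration, not Fusion's actual algorithm: it assumes one common way to get a function-preserving expansion (new hidden neurons whose outgoing weights start at zero) and then runs gradient descent on the new neurons only. All names (`grow`, `train_new_only`, the `trainable` flag) and hyperparameters are hypothetical.

```python
import copy
import random

random.seed(0)

def relu(z):
    return z if z > 0.0 else 0.0

def pre_activation(n, x):
    return sum(w * xi for w, xi in zip(n["w_in"], x)) + n["b"]

def forward(x, neurons, out_bias):
    """One hidden ReLU layer; each neuron carries its own in/out weights."""
    y = list(out_bias)
    for n in neurons:
        a = relu(pre_activation(n, x))
        for k, w in enumerate(n["w_out"]):
            y[k] += a * w
    return y

def grow(neurons, extra, d_in, d_out):
    """Append neurons with zero outgoing weights, so outputs are unchanged."""
    new = [{"w_in": [random.gauss(0, 0.1) for _ in range(d_in)],
            "b": 0.0, "w_out": [0.0] * d_out, "trainable": True}
           for _ in range(extra)]
    return neurons + new

def mse(data, neurons, out_bias):
    return sum(sum((yk - tk) ** 2 for yk, tk in zip(forward(x, neurons, out_bias), t))
               for x, t in data) / len(data)

def train_new_only(data, neurons, out_bias, lr=0.01, steps=100):
    """Full-batch gradient descent that updates trainable neurons only."""
    new = [n for n in neurons if n["trainable"]]
    for _ in range(steps):
        grads = [{"w_in": [0.0] * len(n["w_in"]), "b": 0.0,
                  "w_out": [0.0] * len(n["w_out"])} for n in new]
        for x, t in data:
            y = forward(x, neurons, out_bias)
            err = [2.0 * (yk - tk) / len(data) for yk, tk in zip(y, t)]
            for n, g in zip(new, grads):
                pre = pre_activation(n, x)
                for k, e in enumerate(err):
                    g["w_out"][k] += e * relu(pre)
                if pre > 0.0:  # ReLU passes gradient only when active
                    d_a = sum(e * w for e, w in zip(err, n["w_out"]))
                    g["b"] += d_a
                    for i, xi in enumerate(x):
                        g["w_in"][i] += d_a * xi
        for n, g in zip(new, grads):
            n["w_out"] = [w - lr * gw for w, gw in zip(n["w_out"], g["w_out"])]
            n["w_in"] = [w - lr * gw for w, gw in zip(n["w_in"], g["w_in"])]
            n["b"] -= lr * g["b"]

d_in, d_out = 3, 2
old = [{"w_in": [random.gauss(0, 1) for _ in range(d_in)], "b": 0.0,
        "w_out": [random.gauss(0, 1) for _ in range(d_out)], "trainable": False}
       for _ in range(5)]
out_bias = [0.0] * d_out
data = [([random.gauss(0, 1) for _ in range(d_in)],
         [random.gauss(0, 1) for _ in range(d_out)]) for _ in range(16)]

frozen_snapshot = copy.deepcopy(old)
before = [forward(x, old, out_bias) for x, _ in data]
grown = grow(old, extra=3, d_in=d_in, d_out=d_out)
after_grow = [forward(x, grown, out_bias) for x, _ in data]

loss_before = mse(data, grown, out_bias)
train_new_only(data, grown, out_bias)
loss_after = mse(data, grown, out_bias)

print("outputs identical after growing:", before == after_grow)
print("existing neurons untouched:", grown[:len(old)] == frozen_snapshot)
print("loss:", loss_before, "->", loss_after)
```

Because the new neurons contribute exactly zero until their outgoing weights move, the grown network starts from the original function. Only the added neurons are ever updated, which is the sense in which existing knowledge is held intact in this sketch.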

See How Fusion works for more.

Where to go next

Fusion Training Console