Two months ago I wrote about why we decided to stop treating pretraining like someone else's job. At the time, Trinity Nano Preview and Trinity Mini had just released, and Trinity Large had started training. We were in the middle of our first run so big that you either laughed or got nauseous. Frankly, I felt we'd either end up with a really great base model or fall flat on our faces with a tired wallet. Little did I know, we'd get both.

Here's what we're shipping, what surprised us, what broke, and what it took to make a 400B sparse MoE behave.

We're putting out three variants:

- **Trinity-Large-Preview** is lightly post-trained and chat-ready.
- **Trinity-Large-Base** is our best pretraining checkpoint after the full 17T-token recipe.
- **TrueBase** is an early checkpoint from the same run at 10T tokens, without any instruct data or LR anneals. It's what many would consider a true base model.

Trinity-Large is a 400B-parameter sparse MoE with 13B active parameters per token. It uses 256 experts, with 4 experts active per token. That sparsity ratio is pretty high compared to our peers, save for Llama 4 Maverick:

| Model | Routing (k-of-N) | Routing fraction |
| --- | --- | --- |
| Trinity Large | 4-of-256 | 1.56% |
| DeepSeek-V3 | 8-of-256 | 3.13% |
| MiniMax-M2 | 8-of-256 | 3.13% |
| GLM-4.5 | 8-of-160 | 5.0% |
| Qwen3-235B-A22B | 8-of-128 | 6.25% |
| Llama 4 Maverick | 1-of-128 | 0.78% |

We originally aimed for a slightly different total size (420B), but we ended up increasing the number of dense layers (from 3 to 6) to help keep routing stable at this sparsity.

Trinity-Large-Base is a true frontier-class foundation model. We match or exceed our peers among open base models across a wide range of benchmarks, including math, coding, scientific reasoning, and raw knowledge absorption.

## Inference efficiency

We trained on 2048 Nvidia B300 GPUs. As far as we can tell, it's the largest (publicly stated, at least) pretraining run done on these machines. That means two things:

- They're wicked fast.
- They're not cheap.

Therefore, we had to make the most of the money we allotted to these m...
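As a footnote on the k-of-N routing described above, here's a minimal sketch of how top-k expert routing works. This is an illustrative toy under stated assumptions, not our actual router: the `TopKRouter` name, the dimensions, and the plain softmax-over-top-k gating are all invented for the example, and it omits everything a production MoE needs (load-balancing losses, expert capacity limits, the dense layers mentioned earlier).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Toy k-of-N router: each token picks its k highest-scoring experts."""

    def __init__(self, d_model: int, n_experts: int = 256, k: int = 4):
        super().__init__()
        self.k = k
        self.n_experts = n_experts
        # one logit per expert, per token
        self.gate = nn.Linear(d_model, n_experts, bias=False)

    def forward(self, x: torch.Tensor):
        # x: (tokens, d_model) -> logits: (tokens, n_experts)
        logits = self.gate(x)
        # keep only the k highest-scoring experts for each token
        topk_vals, topk_idx = logits.topk(self.k, dim=-1)
        # renormalize the surviving scores so each token's weights sum to 1
        weights = F.softmax(topk_vals, dim=-1)
        return topk_idx, weights

router = TopKRouter(d_model=1024)  # d_model chosen arbitrarily for the demo
tokens = torch.randn(8, 1024)
idx, w = router(tokens)
print(idx.shape, w.shape)  # torch.Size([8, 4]) torch.Size([8, 4])
print(f"routing fraction: {router.k / router.n_experts:.2%}")  # 1.56%
```

The printed routing fraction is just k/N, which is where the percentages in the table come from: 4 active experts out of 256 means each token touches 1.56% of the expert pool.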