High-throughput chips for LLMs
Our goal is to make the best chips physically possible for the large-model needs of frontier labs.
The MatX One chip delivers higher throughput than any announced product while also matching the best latencies of any announced product. For training and prefill, it excels on FLOPS; for decode and RL, it excels on latency, FLOPS, and long-context support.
What we offer
- The highest FLOPS/mm².
- Weights are typically in SRAM, for low latency. This allows >2000 output tokens/second for large 100-layer MoE models.
- KV caches are typically in HBM, to support long contexts well.
- The most scale-up interconnect of any product.
- Excellent scale-out interconnect, supporting clusters with hundreds of thousands of chips.
- A programming model that gives you direct control over the hardware.
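As a back-of-envelope check on the decode claim above, here is a minimal sketch of the arithmetic, assuming per-token decode latency is dominated by sequential traversal of the model's layers. The 5 µs/layer budget is an illustrative assumption, not a MatX spec; the 100-layer and 2000 tokens/second figures come from the list above.

```python
# Back-of-envelope single-request decode throughput, assuming each output
# token requires one sequential pass through all layers.
# The per-layer latency budget is an illustrative assumption, not a MatX spec.

LAYERS = 100               # large 100-layer MoE model (from the text)
PER_LAYER_SECONDS = 5e-6   # assumed per-layer latency with weights in SRAM

per_token_seconds = LAYERS * PER_LAYER_SECONDS
tokens_per_second = 1 / per_token_seconds

print(f"{tokens_per_second:.0f} output tokens/second")  # 2000 with these assumptions
```

Under these assumptions, hitting >2000 tokens/second means completing each layer in under ~5 µs, which is why keeping weights in on-chip SRAM (rather than fetching them from HBM each step) matters for decode latency.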
Target workloads
- Training, RL, inference prefill, inference decode.
- Large MoE models work well, as do large dense models. There is no upper limit on model size.
- No small models, no convolutions, no recommenders.
Investors
Jane Street, Situational Awareness LP, Spark Capital, Nat Friedman and Daniel Gross's fund, Triatomic Capital, and many others.