Optimize for inference too, not just training FLOPs
January 08, 2025
We discuss the importance of balancing training and inference costs, and strategies for doing so, when selecting LLM architectures.
Introducing seqax: A Simple and Efficient LLM Research Codebase
May 06, 2024
We're excited to announce seqax, a research-focused LLM codebase that is simple, efficient, and performs well on up to 100 GPUs or TPUs. Everything you need to edit, from the math to the parallelism to the memory footprint, is there in 500 lines of JAX code.