google-deepmind / recurrentgemma
Open weights language model from Google DeepMind, based on Griffin.
☆627 · Updated last month
Alternatives and similar repositories for recurrentgemma:
Users interested in recurrentgemma are comparing it to the libraries listed below:
- a small code base for training large models ☆288 · Updated 3 months ago
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆855 · Updated last month
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆524 · Updated last month
- Annotated version of the Mamba paper ☆475 · Updated last year
- Muon optimizer: >30% sample-efficiency improvement with <3% wallclock overhead ☆505 · Updated last week
- [ICML 2024] CLLMs: Consistency Large Language Models ☆386 · Updated 4 months ago
- Minimalistic 4D-parallelism distributed training framework for education purposes ☆948 · Updated 2 weeks ago
- A JAX research toolkit for building, editing, and visualizing neural networks. ☆1,746 · Updated 3 months ago
- For optimization algorithm research and development. ☆498 · Updated this week
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and JAX ☆557 · Updated this week
- Code to train and evaluate Neural Attention Memory Models to obtain universally applicable memory systems for transformers. ☆297 · Updated 5 months ago
- ☆214 · Updated 8 months ago
- PyTorch implementation of models from the Zamba2 series. ☆177 · Updated last month
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆601 · Updated 3 months ago
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆223 · Updated last month
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch ☆506 · Updated 4 months ago
- ☆301 · Updated 9 months ago
- Minimalistic large language model 3D-parallelism training ☆1,701 · Updated this week
- Long-context evaluation for large language models ☆202 · Updated 2 weeks ago
- Helpful tools and examples for working with flex-attention ☆695 · Updated this week
- A pure NumPy implementation of Mamba. ☆219 · Updated 8 months ago
- Flash Attention in ~100 lines of CUDA (forward pass only) ☆740 · Updated 2 months ago
- Scalable and Performant Data Loading ☆230 · Updated this week
- ☆393 · Updated 2 weeks ago
- Quick implementation of nGPT, learning entirely on the hypersphere, from Nvidia AI ☆276 · Updated this week
- Felafax is building AI infra for non-NVIDIA GPUs ☆555 · Updated last month
- Large Context Attention ☆690 · Updated last month
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆223 · Updated last month
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆307 · Updated 3 months ago
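
The last entry above describes the core idea behind memory layers: a large trainable key-value table from which each token reads only a handful of entries, so parameter count grows without a matching growth in per-token compute. The snippet below is a minimal PyTorch sketch of that mechanism, not code from any of the listed repositories; the class name, table size, and top-k value are illustrative assumptions, and practical memory layers use tricks such as product-key factorization to keep the key search itself cheap.

```python
# Minimal sketch (illustrative only): a sparse key-value memory layer.
# NOT the implementation from the repository above; class name, sizes,
# and top-k are assumptions chosen for readability.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseKeyValueMemory(nn.Module):
    """Trainable key-value lookup: each token scores a large table of keys,
    keeps only the top-k matches, and reads a weighted sum of the matching
    values. Only k value vectors are read per token, so the value table can
    be made very large; note this naive version still scores every key
    (real memory layers factor the key search, e.g. with product keys)."""

    def __init__(self, d_model: int, num_entries: int = 4096, topk: int = 8):
        super().__init__()
        self.query_proj = nn.Linear(d_model, d_model, bias=False)
        self.keys = nn.Parameter(torch.randn(num_entries, d_model) * 0.02)
        self.values = nn.Parameter(torch.randn(num_entries, d_model) * 0.02)
        self.topk = topk

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.query_proj(x)                                # (B, T, D)
        scores = q @ self.keys.T                              # (B, T, num_entries)
        top_scores, top_idx = scores.topk(self.topk, dim=-1)  # (B, T, k)
        weights = F.softmax(top_scores, dim=-1)               # (B, T, k)
        gathered = self.values[top_idx]                       # (B, T, k, D)
        out = (weights.unsqueeze(-1) * gathered).sum(dim=-2)  # (B, T, D)
        return x + out                                        # residual connection


if __name__ == "__main__":
    layer = SparseKeyValueMemory(d_model=64, num_entries=1024, topk=4)
    x = torch.randn(2, 16, 64)
    print(layer(x).shape)  # torch.Size([2, 16, 64])
```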