google-deepmind / recurrentgemma
Open weights language model from Google DeepMind, based on Griffin.
☆654 Updated 6 months ago
Alternatives and similar repositories for recurrentgemma
Users who are interested in recurrentgemma are comparing it to the repositories listed below.
- a small code base for training large models ☆315 Updated 7 months ago
- Annotated version of the Mamba paper ☆491 Updated last year
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆932 Updated 2 weeks ago
- Visualize the intermediate output of Mistral 7B ☆381 Updated 10 months ago
- A JAX research toolkit for building, editing, and visualizing neural networks. ☆1,830 Updated 5 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆329 Updated last year
- ☆314 Updated last year
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆626 Updated 8 months ago
- Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆564 Updated last year
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi… ☆352 Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆685 Updated last week
- ☆285 Updated last year
- Fast bare-bones BPE for modern tokenizer training ☆171 Updated 5 months ago
- A pure NumPy implementation of Mamba. ☆223 Updated last year
- [ICML 2024] CLLMs: Consistency Large Language Models ☆407 Updated last year
- Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping". ☆375 Updated last year
- Reference implementation of Megalodon 7B model ☆525 Updated 6 months ago
- The repository for the code of the UltraFastBERT paper ☆520 Updated last year
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ☆548 Updated 6 months ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ☆561 Updated 11 months ago
- An interactive HTML pretty-printer for machine learning research in IPython notebooks. ☆451 Updated 3 months ago
- A repository for research on medium sized language models. ☆520 Updated 5 months ago
- ☆205 Updated last week
- A Jax-based library for building transformers, includes implementations of GPT, Gemma, LlaMa, Mixtral, Whisper, SWin, ViT and more. ☆297 Updated last year
- Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta… ☆540 Updated 2 weeks ago
- ☆546 Updated last year
- Accelerate, Optimize performance with streamlined training and serving options with JAX. ☆325 Updated this week
- Inference code for Persimmon-8B ☆412 Updated 2 years ago
- LLM Analytics ☆696 Updated last year
- For optimization algorithm research and development. ☆547 Updated 2 weeks ago