HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆252 · Jan 12, 2026 · Updated last month
Alternatives and similar repositories for zoology
Users who are interested in zoology compare it to the libraries listed below.
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆248 · Jun 6, 2025 · Updated 8 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆138 · Dec 17, 2024 · Updated last year
- HGRN2: Gated Linear RNNs with State Expansion ☆56 · Aug 20, 2024 · Updated last year
- ☆53 · May 20, 2024 · Updated last year
- Parallel Associative Scan for Language Models ☆18 · Jan 8, 2024 · Updated 2 years ago
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆56 · Dec 4, 2024 · Updated last year
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… ☆66 · Apr 24, 2024 · Updated last year
- ☆58 · Jul 9, 2024 · Updated last year
- ☆35 · Feb 26, 2024 · Updated last year
- Accelerated First Order Parallel Associative Scan ☆196 · Jan 7, 2026 · Updated last month
- ☆51 · Jan 28, 2024 · Updated 2 years ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 · Apr 17, 2024 · Updated last year
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" ☆18 · Mar 15, 2024 · Updated last year
- Here we will test various linear attention designs. ☆62 · Apr 25, 2024 · Updated last year
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024) ☆24 · Jun 6, 2024 · Updated last year
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ☆562 · Dec 28, 2024 · Updated last year
- 🚀 Efficient implementations of state-of-the-art linear attention models ☆4,379 · Updated this week
- ☆11 · Oct 11, 2023 · Updated 2 years ago
- Experiment of using Tangent to autodiff triton ☆82 · Jan 22, 2024 · Updated 2 years ago
- Official Code Repository for the paper "Key-value memory in the brain" ☆31 · Feb 25, 2025 · Updated 11 months ago
- Official code for the paper "Attention as a Hypernetwork" ☆47 · Jun 22, 2024 · Updated last year
- Triton Implementation of HyperAttention Algorithm ☆48 · Dec 11, 2023 · Updated 2 years ago
- train with kittens! ☆63 · Oct 25, 2024 · Updated last year
- Annotated version of the Mamba paper ☆496 · Feb 27, 2024 · Updated last year
- Combining SOAP and MUON ☆19 · Feb 11, 2025 · Updated last year
- ☆20 · May 30, 2024 · Updated last year
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆169 · Jan 30, 2025 · Updated last year
- Parallelizing non-linear sequential models over the sequence length ☆56 · Jun 23, 2025 · Updated 7 months ago
- ☆29 · Jul 9, 2024 · Updated last year
- Checkpointable dataset utilities for foundation model training ☆32 · Jan 29, 2024 · Updated 2 years ago
- Efficient PScan implementation in PyTorch ☆17 · Jan 2, 2024 · Updated 2 years ago
- Implementation of GateLoop Transformer in Pytorch and Jax ☆92 · Jun 18, 2024 · Updated last year
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Oct 9, 2022 · Updated 3 years ago
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆342 · Dec 28, 2024 · Updated last year
- ☆106 · Mar 9, 2024 · Updated last year
- ☆124 · May 28, 2024 · Updated last year
- Stick-breaking attention ☆62 · Jul 1, 2025 · Updated 7 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆251 · Jan 31, 2025 · Updated last year
- An annotated implementation of the Hyena Hierarchy paper ☆34 · May 28, 2023 · Updated 2 years ago