Leiay / looped_transformer
☆35Dec 12, 2023Updated 2 years ago
Alternatives and similar repositories for looped_transformer
Users who are interested in looped_transformer are comparing it to the repositories listed below
- Official implementation of the transformer (TF) architecture suggested in a paper entitled "Looped Transformers as Programmable Computers…☆30Apr 8, 2023Updated 2 years ago
- Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure (NeurIPS 2024) + Arithmetic Transfor…☆14Oct 26, 2025Updated 3 months ago
- Combining SOAP and MUON☆19Feb 11, 2025Updated last year
- ☆11Jun 29, 2021Updated 4 years ago
- Code accompanying the paper "A contrastive rule for meta-learning"☆13Oct 31, 2024Updated last year
- ☆12Sep 18, 2024Updated last year
- 💻 Terminal-Agent with Human-in-the-Loop Learning☆34Jan 16, 2026Updated last month
- ☆36Feb 12, 2025Updated last year
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches☆15Jun 21, 2019Updated 6 years ago
- PyTorch implementation of StableMask (ICML'24)☆15Jun 27, 2024Updated last year
- Mamba support for transformer lens☆19Sep 17, 2024Updated last year
- ☆20Mar 1, 2023Updated 2 years ago
- Official repo of paper LM2☆47Feb 13, 2025Updated last year
- ☆45Apr 30, 2018Updated 7 years ago
- ☆20Oct 25, 2022Updated 3 years ago
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun☆56Mar 10, 2025Updated 11 months ago
- Manually implemented quantization-aware training☆23Oct 12, 2022Updated 3 years ago
- ☆27Feb 1, 2023Updated 3 years ago
- benchmarking some transformer deployments☆26Dec 15, 2025Updated 2 months ago
- Omnigrok: Grokking Beyond Algorithmic Data☆62Feb 24, 2023Updated 2 years ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Apr 17, 2024Updated last year
- Fast matrix multiplication for few-bit integer matrices on CPUs.☆28Mar 19, 2019Updated 6 years ago
- Educational verilog library that supports IEEE754 floating point arithmetic with a parametrizable mantissa and exponent☆32Mar 13, 2025Updated 11 months ago
- Official Repository for Spherical Voronoi: Directional Appearance as a Differentiable Partition of the Sphere☆70Jan 29, 2026Updated 2 weeks ago
- Official Code Repository for the paper "Key-value memory in the brain"☆31Feb 25, 2025Updated 11 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)☆198May 28, 2024Updated last year
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models☆35Jun 12, 2024Updated last year
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022☆31May 29, 2023Updated 2 years ago
- ☆35Apr 12, 2024Updated last year
- PyTorch implementation for our ICLR 2024 paper "Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory…☆26Dec 21, 2023Updated 2 years ago
- Wrappers for open source FPU hardware implementations.☆37Nov 27, 2025Updated 2 months ago
- BitLinear implementation☆35Jan 1, 2026Updated last month
- QJL: 1-Bit Quantized JL transform for KV Cache Quantization with Zero Overhead☆31Jan 27, 2025Updated last year
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆89Oct 30, 2024Updated last year
- [CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference☆30Mar 14, 2024Updated last year
- ☆32Oct 31, 2024Updated last year
- Official code for "Visual Attention Emerges from Recurrent Sparse Reconstruction" (ICML 2022)☆36Jul 5, 2022Updated 3 years ago
- rebuilds and completes models of protein complexes using AlphaFold2☆15Jan 22, 2026Updated 3 weeks ago
- Kinematic and dynamic models of continuum and articulated soft robots.☆15Nov 22, 2025Updated 2 months ago