☆44 · Dec 12, 2023 · Updated 2 years ago
Alternatives and similar repositories for looped_transformer
Users who are interested in looped_transformer are comparing it to the libraries listed below.
- Official implementation of the transformer (TF) architecture suggested in a paper entitled "Looped Transformers as Programmable Computers… ☆39 · Apr 8, 2023 · Updated 3 years ago
- Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure (NeurIPS 2024) + Arithmetic Transfor… ☆14 · Oct 26, 2025 · Updated 7 months ago
- ☆20 · Oct 25, 2022 · Updated 3 years ago
- ☆31 · Mar 30, 2026 · Updated last month
- Code accompanying the paper "A contrastive rule for meta-learning" ☆13 · Oct 31, 2024 · Updated last year
- Sigmoid Colon: The biologically inspired activation function. ☆25 · Mar 10, 2023 · Updated 3 years ago
- ☆21 · Mar 1, 2023 · Updated 3 years ago
- ☆12 · Sep 18, 2024 · Updated last year
- 💻 Terminal-Agent with Human-in-the-Loop Learning ☆39 · Jan 16, 2026 · Updated 4 months ago
- Residual vector quantization for KV cache compression in large language model ☆12 · Oct 22, 2024 · Updated last year
- Learning Accurate Decision Trees with Bandit Feedback via Quantized Gradient Descent ☆16 · Sep 8, 2022 · Updated 3 years ago
- ☆39 · Mar 29, 2024 · Updated 2 years ago
- Official Code Repository for the paper "Key-value memory in the brain" ☆32 · Feb 25, 2025 · Updated last year
- ☆13 · May 26, 2025 · Updated last year
- Fit Lasso model to binary rules created from tree ensembles ☆12 · Aug 2, 2017 · Updated 8 years ago
- ☆37 · Feb 12, 2025 · Updated last year
- Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models" ☆169 · Jan 16, 2025 · Updated last year
- ☆28 · Feb 1, 2023 · Updated 3 years ago
- Ada-LISTA: Learned Solvers Adaptive to Varying Models ☆11 · Feb 18, 2020 · Updated 6 years ago
- Renmin University of China YOJ problem set ☆12 · Jun 9, 2022 · Updated 3 years ago
- Code repository of AI-Endo ☆16 · Jan 16, 2024 · Updated 2 years ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆201 · May 28, 2024 · Updated 2 years ago
- ☆17 · Oct 31, 2023 · Updated 2 years ago
- Official repo of paper LM2 ☆48 · Feb 13, 2025 · Updated last year
- slowly building a set of infinite riddle generators for data-hungry methods ☆14 · Nov 15, 2022 · Updated 3 years ago
- PyTorch implementation of StableMask (ICML'24) ☆15 · Jun 27, 2024 · Updated last year
- Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective" ☆22 · Jul 16, 2023 · Updated 2 years ago
- [ASPDAC23] High Dimensional Yield Estimation using Shrinkage Deep Features and Maximization of Integral Entropy Reduction ☆14 · Oct 9, 2022 · Updated 3 years ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆40 · Nov 11, 2024 · Updated last year
- ☆11 · Jun 29, 2021 · Updated 4 years ago
- Python package for NN generation from physics ☆16 · Mar 25, 2023 · Updated 3 years ago
- Official repository for Targeted Unlearning with Single Layer Unlearning Gradient (SLUG), ICML 2025 ☆18 · Aug 10, 2025 · Updated 9 months ago
- Monitoring a PyTorch Lightning CNN with Weights & Biases ☆15 · Jul 26, 2021 · Updated 4 years ago
- Implementation of Neurips 2023 Paper "Multi Time Scale World Models" ☆17 · Nov 8, 2024 · Updated last year
- Omnigrok: Grokking Beyond Algorithmic Data ☆65 · Feb 24, 2023 · Updated 3 years ago
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches ☆15 · Jun 21, 2019 · Updated 6 years ago
- Localization of Knowledge in Text-to-Image Models ☆12 · Oct 8, 2024 · Updated last year
- Official repository for our paper, Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode… ☆19 · Nov 19, 2024 · Updated last year
- Mutual information estimators and benchmarks ☆14 · May 19, 2026 · Updated last week