corl-team / lime
Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity"
☆27 Updated last month
Alternatives and similar repositories for lime:
Users who are interested in lime are comparing it to the repositories listed below.
- ☆21 Updated this week
- Official Implementation for "Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing" ☆52 Updated 6 months ago
- Code for the paper "PALBERT: Teaching ALBERT to Ponder", NeurIPS 2022 Spotlight ☆37 Updated last year
- Vintix: Action Model via In-Context Reinforcement Learning ☆33 Updated 3 weeks ago
- ☆71 Updated 7 months ago
- Compression schema for gradients of activations in backward pass ☆44 Updated last year
- PyTorch implementation of "Neural Optimal Transport with General Cost Functionals" (ICLR 2024) ☆17 Updated 7 months ago
- ☆20 Updated 8 months ago
- ☆15 Updated 2 years ago
- ☆30 Updated 4 months ago
- Deep Generative Models course, 2021 ☆22 Updated 3 years ago
- [ICLR 2025] Official PyTorch implementation of "Forgetting Transformer: Softmax Attention with a Forget Gate" ☆80 Updated this week
- XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning ☆66 Updated last month
- GULAG: GUessing LAnGuages with neural networks ☆13 Updated 2 years ago
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto ☆55 Updated 10 months ago
- ☆33 Updated 6 months ago
- ☆33 Updated 2 months ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆95 Updated 7 months ago
- Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models" ☆159 Updated 2 months ago
- The Gaussian Histogram Loss (HL-Gauss) proposed by Imani et al. with a few convenient wrappers for regression, in PyTorch ☆57 Updated last month
- Explorations into adversarial losses on top of autoregressive loss for language modeling ☆35 Updated last month
- ☆22 Updated 9 months ago
- Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group ☆36 Updated 6 months ago
- Implementation of the proposed Spline-Based Transformer from Disney Research ☆87 Updated 4 months ago
- Remasking Discrete Diffusion Models with Inference-Time Scaling ☆16 Updated 3 weeks ago
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆44 Updated 2 weeks ago
- A repo where I play with conditional flow approaches for learning time-varying vector-fields. ☆17 Updated 9 months ago
- ☆24 Updated last year
- ☆51 Updated 9 months ago
- Focused on fast experimentation and simplicity ☆70 Updated 3 months ago