corl-team / rebased
Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models"
☆160Updated 5 months ago
Alternatives and similar repositories for rebased
Users who are interested in rebased compare it to the libraries listed below.
- ☆70Updated 9 months ago
- Effective LLM Alignment Toolkit☆132Updated last month
- σ-GPT: A New Approach to Autoregressive Models☆65Updated 10 months ago
- ☆15Updated 3 weeks ago
- ☆20Updated 11 months ago
- ☆31Updated 9 months ago
- Focused on fast experimentation and simplicity☆75Updated 6 months ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆101Updated 3 months ago
- Evalica, your favourite evaluation toolkit☆37Updated 3 weeks ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆101Updated 6 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers☆66Updated 2 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters☆126Updated 6 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache☆109Updated 2 months ago
- Fast modular code to create and train cutting edge LLMs☆67Updated last year
- Efficient optimizers☆220Updated last week
- ☆53Updated last year
- PyTorch implementation of models from the Zamba2 series.☆182Updated 5 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.☆198Updated 11 months ago
- Understand and test language model architectures on synthetic tasks.☆218Updated 2 weeks ago
- A benchmark for role-playing language models☆99Updated last month
- ☆51Updated 3 months ago
- ☆13Updated last year
- PyTorch implementation of the PEER block from the paper "Mixture of A Million Experts" by Xu Owen He at DeepMind☆127Updated 10 months ago
- Framework for processing and filtering datasets☆27Updated 10 months ago
- Griffin MQA + Hawk Linear RNN Hybrid☆87Updated last year
- Supporting PyTorch FSDP for optimizers☆82Updated 6 months ago
- ☆78Updated 11 months ago
- 2D Positional Embeddings for Webpage Structural Understanding 🦙👀☆95Updated 9 months ago
- Modified Arena-Hard-Auto LLM evaluation toolkit with an emphasis on Russian language☆42Updated 3 months ago
- ☆81Updated last year