tomaarsen / attention_sinks
Extend existing LLMs well beyond their original training length with constant memory usage, without retraining
☆722 · Updated last year
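For context, attention_sinks is presented as a drop-in replacement for the Hugging Face transformers Auto* classes: it keeps a handful of initial "sink" tokens plus a sliding window of recent tokens in the KV cache, so memory stays constant during long generation. The snippet below is a minimal sketch of that usage pattern; the model name, window sizes, and the `attention_sink_size` / `attention_sink_window_size` keyword arguments follow the project's README-style API and are assumptions, not a verified specification.

```python
# Sketch of the drop-in usage pattern for attention_sinks (assumed API):
# the library mirrors transformers' Auto* classes and bounds the KV cache
# to a few sink tokens plus a sliding window of recent tokens.
from transformers import AutoTokenizer
from attention_sinks import AutoModelForCausalLM  # drop-in replacement for transformers

model_name = "meta-llama/Llama-2-7b-hf"  # illustrative; any supported causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    attention_sink_size=4,            # assumed kwarg: number of sink tokens retained
    attention_sink_window_size=1020,  # assumed kwarg: sliding window of recent tokens
)

# Because the cache never grows past sink + window tokens, generation can
# continue far beyond the pretraining context length with flat memory use.
inputs = tokenizer("A very long streaming prompt ...", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```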
Alternatives and similar repositories for attention_sinks
Users interested in attention_sinks are comparing it to the libraries listed below.
- Code for fine-tuning Platypus fam LLMs using LoRA ☆628 · Updated last year
- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning ☆662 · Updated last year
- ☆546 · Updated 10 months ago
- [ACL 2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive… ☆966 · Updated last year
- A bagel, with everything. ☆324 · Updated last year
- batched loras ☆346 · Updated 2 years ago
- ☆572 · Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers ☆426 · Updated last year
- Finetuning Large Language Models on One Consumer GPU in 2 Bits ☆730 · Updated last year