epfml / landmark-attentionLinks

Landmark Attention: Random-Access Infinite Context Length for Transformers

☆426

Alternatives and similar repositories for landmark-attention

Users that are interested in landmark-attention are comparing it to the libraries listed below

Sorting:

imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆201Updated last year
pratyushasharma / laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
☆388Updated last year
Gryphe / BlockMerge_Gradient
Merge Transformers language models by use of gradient parameters.
☆208Updated last year
SkunkworksAI / hydra-moe
☆415Updated last year
jondurbin / bagel
A bagel, with everything.
☆324Updated last year
datamllab / LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
☆662Updated last year
tomaarsen / attention_sinks
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
☆722Updated last year
zphang / minimal-llama
☆457Updated 2 years ago
sabetAI / BLoRA
batched loras
☆346Updated 2 years ago
Vahe1994 / SpQR
☆546Updated 10 months ago
Guitaricet / relora
Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
☆465Updated last year
hydrallm / llama-moe-v1
☆96Updated 2 years ago
Cornell-RelaxML / QuIP
Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"
☆385Updated last year
DachengLi1 / LongChat
Official repository for LongChat and LongEval
☆531Updated last year
kmeng01 / memit
Mass-editing thousands of facts into a transformer memory (ICLR 2023)
☆519Updated last year
SqueezeAILab / SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
☆704Updated last year
conceptofmind / toolformer
☆371Updated 2 years ago
nlpxucan / evol-instruct
☆274Updated 2 years ago
IST-DASLab / qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
☆277Updated last year
johnsmith0031 / alpaca_lora_4bit
☆534Updated last year
dust-tt / llama-ssp
Experiments on speculative sampling with Llama models
☆125Updated 2 years ago
apoorvumang / prompt-lookup-decoding
☆572Updated last year
HazyResearch / TART
TART: A plug-and-play Transformer module for task-agnostic reasoning
☆200Updated 2 years ago
lm-sys / llm-decontaminator
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
☆311Updated last year
uukuguy / multi_loras
Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe…
☆158Updated last year
QuixiAI / laserRMT
This is our own implementation of 'Layer Selective Rank Reduction'
☆239Updated last year
jondurbin / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆76Updated last year
dzhulgakov / llama-mistral
Inference code for Mistral and Mixtral hacked up into original Llama implementation
☆368Updated last year
yuhuixu1993 / qa-lora
Official PyTorch implementation of QA-LoRA
☆141Updated last year
persimmon-ai-labs / adept-inference
Inference code for Persimmon-8B
☆413Updated 2 years ago