minyoungg / LTE

☆69

Alternatives and similar repositories for LTE

Users who are interested in LTE are comparing it to the repositories listed below.

FasterDecoding / BitDelta
☆203Updated 11 months ago
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78Updated last year
HanGuo97 / lq-lora
☆128Updated last year
RobertCsordas / moeut
☆89Updated last year
kyleliang919 / Online-Subspace-Descent
[NeurIPS 2024] Low rank memory efficient optimizer without SVD
☆31Updated 5 months ago
Zyphra / tree_attention
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
☆130Updated last year
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆100Updated last year
VITA-Group / WeLore
[ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications
☆51Updated last month
siyan-zhao / prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS …
☆60Updated last year
SalesforceAIResearch / GemFilter
☆85Updated 3 weeks ago
google-deepmind / asyncdiloco
☆47Updated last year
jeffreysijuntan / lloco
The official repo for "LLoCo: Learning Long Contexts Offline"
☆118Updated last year
sanyalsunny111 / LLM-Inheritune
This is the official repository for Inheritune.
☆115Updated 9 months ago
BorealisAI / flora-opt
This is the official repository for the paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" in ICML 2024.
☆105Updated last year
epfml / schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
☆85Updated last year
hahnyuan / PB-LLM
PB-LLM: Partially Binarized Large Language Models
☆157Updated 2 years ago
itsnamgyu / block-transformer
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
☆162Updated 7 months ago
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆60Updated last year
snu-mllab / Context-Memory
Pytorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24)
☆63Updated last year
sail-sg / SkyLadder
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆40Updated last month
RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆102Updated last year
katiekang1998 / reasoning_generalization
☆33Updated 10 months ago
Infini-AI-Lab / gsm_infinite
☆55Updated 5 months ago
schwartz-lab-NLP / TOVA
Token Omission Via Attention
☆127Updated last year
PiotrNawrot / sparse-frontier
The evaluation framework for training-free sparse attention in LLMs
☆106Updated last month
lucidrains / PEER-pytorch
Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind
☆131Updated last month
lucidrains / llama-qrlhf
Implementation of the Llama architecture with RLHF + Q-learning
☆168Updated 10 months ago
ahans30 / goldfish-loss
[NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs
☆92Updated last year
JacobPfau / fillerTokens
☆75Updated last year
joey00072 / ohara
Collection of autoregressive model implementations
☆86Updated 7 months ago