nyunAI / Faster-LLM-Survey
Related projects:
- The official repo for "LLoCo: Learning Long Contexts Offline"
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models
- Prune transformer layers
- Codebase accompanying the "Summary of a Haystack" paper
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
- The Efficiency Spectrum of LLMs
- Code for the NeurIPS LLM Efficiency Challenge
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
- Implementation of the paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google, in PyTorch
- Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding (a minimal sketch of speculative decoding follows this list)
- The official implementation of the paper "Demystifying the Compression of Mixture-of-Experts Through a Unified Framework"
- "Improving Mathematical Reasoning with Process Supervision" by OpenAI
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu, …
- Small and Efficient Mathematical Reasoning LLMs
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore"
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind
- Scalable Meta-Evaluation of LLMs as Evaluators
- LongRoPE, a method that extends the context window of pre-trained LLMs to 2048k tokens
- Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
- A pipeline to improve the skills of large language models
- EE-LLM, a framework for large-scale training and inference of early-exit (EE) large language models (a minimal early-exit sketch follows this list)
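Several entries above name inference-acceleration techniques without showing their mechanics. As a minimal illustration of speculative decoding (a toy sketch, not code from any repo listed here), the snippet below uses random NumPy tables as stand-ins for trained models: a fast draft model proposes `k` tokens, and the target model accepts each with probability min(1, p_target/p_draft), resampling from the residual distribution at the first rejection. All names (`toy_lm`, `speculative_step`) and both "models" are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 16

def toy_lm(seed):
    """Toy stand-in for a language model: maps the last context token
    to a next-token distribution via a fixed random table."""
    table = np.random.default_rng(seed).random((VOCAB, VOCAB))
    def probs(context):
        logits = table[context[-1] % VOCAB]
        e = np.exp(logits - logits.max())
        return e / e.sum()
    return probs

target = toy_lm(1)  # stands in for the large, slow target model
draft = toy_lm(2)   # stands in for the small, fast draft model

def speculative_step(context, k=4):
    """One round of speculative sampling: the draft proposes k tokens;
    the target accepts each with prob min(1, p_target/p_draft) and
    resamples from the residual distribution at the first rejection."""
    ctx, proposed = list(context), []
    for _ in range(k):
        q = draft(ctx)
        t = int(rng.choice(VOCAB, p=q))
        proposed.append((t, q[t]))
        ctx.append(t)
    out, ctx = [], list(context)
    for t, q_t in proposed:
        p = target(ctx)
        if rng.random() < min(1.0, p[t] / q_t):
            out.append(t)  # accepted: keep the draft token
            ctx.append(t)
        else:
            # resample from the residual distribution max(p - q, 0)
            resid = np.maximum(p - draft(ctx), 0.0)
            resid = resid / resid.sum() if resid.sum() > 0 else p
            out.append(int(rng.choice(VOCAB, p=resid)))
            break  # tokens after the first rejection are discarded

    return out

print(speculative_step([3], k=4))
```

In one target-model pass per round, up to `k` tokens are committed, which is where the throughput gain over one-token-per-pass decoding comes from.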
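The early-exit entries (the EMNLP 2023 framework and EE-LLM) rest on a similarly simple control flow. The sketch below is again a toy with random weights standing in for a trained model, not the EE-LLM codebase: after each layer, an intermediate exit head projects the hidden state to token probabilities, and the forward pass stops as soon as the top probability clears a confidence threshold.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, VOCAB, LAYERS = 8, 16, 6

# Random weights standing in for a trained model.
layer_w = [rng.standard_normal((DIM, DIM)) / DIM**0.5 for _ in range(LAYERS)]
unembed = rng.standard_normal((DIM, VOCAB)) / DIM**0.5

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def early_exit_forward(h, threshold=0.30):
    """Run layers one by one; after each, project the hidden state to
    token probabilities and stop as soon as the top probability clears
    the confidence threshold (the early-exit criterion)."""
    for i, w in enumerate(layer_w):
        h = np.tanh(h @ w)         # toy transformer block
        p = softmax(h @ unembed)   # intermediate exit head
        if p.max() >= threshold:
            return int(p.argmax()), i + 1  # token, layers actually used
    return int(p.argmax()), LAYERS         # fell through: full depth

token, depth = early_exit_forward(rng.standard_normal(DIM))
print(f"predicted token {token} after {depth}/{LAYERS} layers")
```

Easy tokens exit after a few layers while hard tokens use the full depth, trading a small amount of accuracy for lower average latency.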