nyunAI / Faster-LLM-SurveyLinks

☆42

Alternatives and similar repositories for Faster-LLM-Survey

Users that are interested in Faster-LLM-Survey are comparing it to the libraries listed below

Sorting:

SeunghyunSEO / optimized_hf_llama_class_for_training
☆48Updated last year
jeffreysijuntan / lloco
The official repo for "LLoCo: Learning Long Contexts Offline"
☆118Updated last year
Upaya07 / NeurIPS-llm-efficiency-challenge
Code for NeurIPS LLM Efficiency Challenge
☆59Updated last year
siyan-zhao / prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS …
☆60Updated last year
UbiquitousLearning / SLM_Survey
☆98Updated last year
itsnamgyu / block-transformer
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
☆162Updated 6 months ago
kyegomez / Infini-attention
Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…
☆56Updated last week
bigcode-project / astraios
Astraios: Parameter-Efficient Instruction Tuning Code Language Models
☆62Updated last year
IST-DASLab / RoSA
Official implementation of the ICML 2024 paper RoSA (Robust Adaptation)
☆44Updated last year
CASE-Lab-UMD / LLM-Drop
The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
☆179Updated 7 months ago
CodeCreator / WebOrganizer
Organize the Web: Constructing Domains Enhances Pre-Training Data Curation
☆67Updated 6 months ago
alperiox / Compact-Language-Models-via-Pruning-and-Knowledge-Distillation
Unofficial implementation of https://arxiv.org/pdf/2407.14679
☆50Updated last year
Digitous / LLM-SLERP-Merge
Spherical Merge Pytorch/HF format Language Models with minimal feature loss.
☆139Updated 2 years ago
ShiZhengyan / InstructionModelling
[NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions"
☆39Updated last year
lucidrains / CALM-pytorch
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind
☆177Updated last year
locuslab / scaling_laws_data_filtering
☆65Updated last year
HanGuo97 / lq-lora
☆127Updated last year
kamanphoebe / Look-into-MoEs
[NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models
☆55Updated 8 months ago
FasterDecoding / BitDelta
☆202Updated 10 months ago
salesforce / summary-of-a-haystack
Codebase accompanying the Summary of a Haystack paper.
☆79Updated last year
wuhy68 / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (EMNLP'24)
☆147Updated last year
mistralai / mistral-evals
☆78Updated 2 months ago
melisa-writer / short-transformers
Prune transformer layers
☆69Updated last year
RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆100Updated last year
YuchuanTian / RethinkTinyLM
[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”
☆124Updated 9 months ago
SalesforceAIResearch / GemFilter
☆85Updated 9 months ago
srush / LLM-Talk
☆52Updated last year
mengxiayu / LLMSuperWeight
Code for studying the super weight in LLM
☆119Updated 10 months ago
john-hewitt / implicit-ins
Codebase for Instruction Following without Instruction Tuning
☆36Updated last year
IST-DASLab / SparseFinetuning
Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry
☆42Updated last year