pratyushasharma / laser

The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction

☆390

Alternatives and similar repositories for laser

Users who are interested in laser are comparing it to the libraries listed below.

QuixiAI / laserRMT
This is our own implementation of 'Layer Selective Rank Reduction'
☆240 · May 26, 2024 · Updated last year
thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆180 · May 2, 2024 · Updated last year
princeton-nlp / LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
☆640 · Mar 4, 2024 · Updated last year
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆6,783 · Jan 26, 2026 · Updated 2 weeks ago
FasterDecoding / Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
☆2,705 · Jun 25, 2024 · Updated last year
HanGuo97 / lq-lora
☆129 · Jan 22, 2024 · Updated 2 years ago
jiaweizzhao / GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
☆1,672 · Oct 28, 2024 · Updated last year
KyujinHan / Sakura-SOLAR-DPO
Sakura-SOLAR-DPO: Merge, SFT, and DPO
☆116 · Dec 30, 2023 · Updated 2 years ago
sdan / selfextend
an implementation of Self-Extend, to expand the context window via grouped attention
☆119 · Jan 7, 2024 · Updated 2 years ago
hao-ai-lab / LookaheadDecoding
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
☆1,316 · Mar 6, 2025 · Updated 11 months ago
IST-DASLab / qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
☆279 · Nov 3, 2023 · Updated 2 years ago
yule-BUAA / MergeLM
Codebase for Merging Language Models (ICML 2024)
☆864 · May 5, 2024 · Updated last year
pbelcak / UltraFastBERT
The repository for the code of the UltraFastBERT paper
☆518 · Mar 24, 2024 · Updated last year
austinsilveria / tricksy
Fast approximate inference on a single GPU with sparsity aware offloading
☆39 · Jan 4, 2024 · Updated 2 years ago
Cornell-RelaxML / quip-sharp
☆577 · Oct 29, 2024 · Updated last year
lucidrains / self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
☆1,407 · Apr 11, 2024 · Updated last year
horseee / LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich…
☆1,105 · Oct 7, 2024 · Updated last year
uclaml / SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
☆1,234 · May 8, 2024 · Updated last year
IST-DASLab / sparsegpt
Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
☆866 · Aug 20, 2024 · Updated last year
datamllab / LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
☆667 · Jun 1, 2024 · Updated last year
OpenGVLab / OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
☆887 · Nov 26, 2025 · Updated 2 months ago
IST-DASLab / marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.
☆1,011 · Sep 4, 2024 · Updated last year
Cornell-RelaxML / QuIP
Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"
☆396 · Feb 24, 2024 · Updated last year
huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,495 · Sep 8, 2025 · Updated 5 months ago
GeneZC / MiniMA
Code for paper titled "Towards the Law of Capacity Gap in Distilling Language Models"
☆102 · Jul 9, 2024 · Updated last year
SafeAILab / EAGLE
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
☆2,180 · Jan 27, 2026 · Updated 2 weeks ago
FasterDecoding / BitDelta
☆203 · Dec 5, 2024 · Updated last year
microsoft / TransformerCompression
For releasing code related to compression methods for transformers, accompanying our publications
☆455 · Jan 16, 2025 · Updated last year
ericwtodd / function_vectors
Function Vectors in Large Language Models (ICLR 2024)
☆191 · Apr 17, 2025 · Updated 9 months ago
HazyResearch / TART
TART: A plug-and-play Transformer module for task-agnostic reasoning
☆202 · Jun 22, 2023 · Updated 2 years ago
SkunkworksAI / hydra-moe
☆415 · Nov 2, 2023 · Updated 2 years ago
euclaise / SlimTrainer
Full finetuning of large language models without large memory requirements
☆94 · Sep 22, 2025 · Updated 4 months ago
fblgit / model-similarity
Simple Model Similarities Analysis
☆21 · Feb 3, 2024 · Updated 2 years ago
RobertCsordas / moeut
☆91 · Aug 18, 2024 · Updated last year
QuixiAI / OpenChatML
☆166 · Aug 8, 2025 · Updated 6 months ago
VatsaDev / NanoPhi-alpha
GPT-2 small trained on phi-like data
☆68 · Feb 18, 2024 · Updated last year
kongds / MoRA
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
☆362 · Aug 7, 2024 · Updated last year
KaiNylund / lm-weights-encode-time
☆68 · Aug 16, 2024 · Updated last year
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆119 · Jul 28, 2024 · Updated last year
