Scaling Sparse Fine-Tuning to Large Language Models
☆18 Jan 31, 2024 Updated 2 years ago
Alternatives and similar repositories for sft-llm
Users who are interested in sft-llm are comparing it to the libraries listed below.
- ☆20 Jul 5, 2024 Updated last year
- ☆30 Jul 22, 2024 Updated last year
- ☆15 Sep 24, 2023 Updated 2 years ago
- Sequence-level 1F1B schedule for LLMs. ☆19 Jun 4, 2024 Updated last year
- ☆35 Mar 25, 2024 Updated last year
- ☆23 Jan 27, 2025 Updated last year
- Code for the paper "Function-Space Learning Rates" ☆25 Jun 3, 2025 Updated 9 months ago
- The open-source materials for the paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 Nov 12, 2024 Updated last year
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆48 Jan 17, 2024 Updated 2 years ago
- ☆25 Oct 31, 2024 Updated last year
- ☆24 Sep 27, 2022 Updated 3 years ago
- MiSS is a novel PEFT method that features a low-rank structure but introduces a new update mechanism distinct from LoRA, achieving an exc… ☆31 Jan 28, 2026 Updated last month
- AFPQ code implementation ☆23 Nov 6, 2023 Updated 2 years ago
- [ICLR'24 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆102 Jun 20, 2025 Updated 8 months ago
- Code for Principal Masked Autoencoders ☆30 Feb 4, 2026 Updated last month
- A collection of utilities for handling IPA phones. ☆26 Sep 24, 2023 Updated 2 years ago
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model… ☆69 Mar 7, 2024 Updated last year
- [AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA ☆27 Apr 9, 2024 Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆33 Aug 14, 2024 Updated last year
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP 2023 Findings. ☆31 Dec 6, 2023 Updated 2 years ago
- ☆129 Jan 22, 2024 Updated 2 years ago
- Official implementation of the ICLR 2024 paper AffineQuant ☆28 Mar 30, 2024 Updated last year
- [TMLR] Official PyTorch implementation of the paper "Efficient Quantization-aware Training with Adaptive Coreset Selection" ☆37 Aug 20, 2024 Updated last year
- Official Implementation of APB (ACL 2025 main Oral) and Spava. ☆34 Jan 30, 2026 Updated last month
- An unofficial implementation of the Infini-gram model proposed by Liu et al. (2024) ☆33 Jun 19, 2024 Updated last year
- ☆34 May 14, 2025 Updated 9 months ago
- ☆75 Nov 19, 2022 Updated 3 years ago
- A Python algorithm to change the pitch of the voice in real time ☆13 Dec 13, 2020 Updated 5 years ago
- ParaAntiProt: Paratope Prediction Using Antibody and Protein Language Models ☆10 Jul 16, 2024 Updated last year
- Martingale posterior neural networks for fast sequential decision making @ NeurIPS 2025 ☆23 Nov 13, 2025 Updated 3 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆84 Nov 27, 2024 Updated last year
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 Jul 19, 2024 Updated last year
- Bayesian Low-Rank Adaptation for Large Language Models ☆37 Jun 22, 2024 Updated last year
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization ☆38 Sep 24, 2024 Updated last year
- Fine-tuning Galactica and Gemma to operate on SMILES. Integrates into a molecular optimization algorithm. ☆36 Feb 20, 2025 Updated last year
- Multi-Candidate Speculative Decoding ☆39 Apr 22, 2024 Updated last year
- Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry ☆42 Jan 15, 2024 Updated 2 years ago
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆42 Jan 18, 2026 Updated last month
- CVPR 2023: PAniC-3D, Vtubers dataset downloader ☆13 Apr 22, 2023 Updated 2 years ago