ducdauge / sft-llm
Scaling Sparse Fine-Tuning to Large Language Models
☆18 · Jan 31, 2024 · Updated 2 years ago
Alternatives and similar repositories for sft-llm
Users that are interested in sft-llm are comparing it to the libraries listed below
- ☆20 · Jul 5, 2024 · Updated last year
- ☆15 · Sep 24, 2023 · Updated 2 years ago
- Sequence-level 1F1B schedule for LLMs. ☆19 · Jun 4, 2024 · Updated last year
- ☆35 · Mar 25, 2024 · Updated last year
- ☆23 · Jan 27, 2025 · Updated last year
- Code for the paper "Function-Space Learning Rates" ☆25 · Jun 3, 2025 · Updated 8 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆48 · Jan 17, 2024 · Updated 2 years ago
- ☆25 · Oct 31, 2024 · Updated last year
- ☆24 · Sep 27, 2022 · Updated 3 years ago
- AFPQ code implementation ☆23 · Nov 6, 2023 · Updated 2 years ago
- [ICLR'24 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆103 · Jun 20, 2025 · Updated 7 months ago
- Code for Principal Masked Autoencoders ☆30 · Feb 4, 2026 · Updated last week
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model… ☆68 · Mar 7, 2024 · Updated last year
- A collection of utilities for handling IPA phones. ☆26 · Sep 24, 2023 · Updated 2 years ago
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆33 · Aug 14, 2024 · Updated last year
- [TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection" ☆37 · Aug 20, 2024 · Updated last year
- Official implementation of the ICLR 2024 paper AffineQuant ☆28 · Mar 30, 2024 · Updated last year
- Official Implementation of APB (ACL 2025 main Oral) and Spava. ☆32 · Jan 30, 2026 · Updated 2 weeks ago
- An unofficial implementation of the Infini-gram model proposed by Liu et al. (2024) ☆33 · Jun 19, 2024 · Updated last year
- ☆40 · Mar 28, 2024 · Updated last year
- ☆34 · May 14, 2025 · Updated 8 months ago
- ☆75 · Nov 19, 2022 · Updated 3 years ago
- A python algorithm to change the pitch of the voice in real time ☆13 · Dec 13, 2020 · Updated 5 years ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 · Jul 19, 2024 · Updated last year
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization ☆37 · Sep 24, 2024 · Updated last year
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆83 · Nov 27, 2024 · Updated last year
- ParaAntiProt: Paratope Prediction Using Antibody and Protein Language Models ☆10 · Jul 16, 2024 · Updated last year
- Martingale posterior neural networks for fast sequential decision making @ Neurips 2025 ☆22 · Nov 13, 2025 · Updated 3 months ago
- Fine-tuning Galactica and Gemma to operate on SMILES. Integrates into a molecular optimization algorithm. ☆36 · Feb 20, 2025 · Updated 11 months ago
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆42 · Jan 18, 2026 · Updated 3 weeks ago
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry ☆42 · Jan 15, 2024 · Updated 2 years ago
- ☆131 · May 29, 2025 · Updated 8 months ago
- ☆14 · Mar 20, 2025 · Updated 10 months ago
- (READ ONLY MIRROR) The ProB Model Checker and Animator Plugin for Rodin ☆19 · Jan 24, 2026 · Updated 2 weeks ago
- ☆16 · Jul 23, 2023 · Updated 2 years ago
- ☆10 · Nov 17, 2022 · Updated 3 years ago
- ☆16 · Feb 22, 2025 · Updated 11 months ago
- ☆46 · Nov 8, 2024 · Updated last year
- [NAACL 24 Oral] LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models ☆39 · Jan 9, 2025 · Updated last year