MNoorFawi / curlora
The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation.
☆53 Updated last year
Alternatives and similar repositories for curlora
Users who are interested in curlora are comparing it to the libraries listed below.
- A repository for research on medium sized language models. ☆78 Updated last year
- This is the official repository for Inheritune. ☆115 Updated 10 months ago
- ☆70 Updated last year
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆173 Updated 10 months ago
- This is the official repository for the paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" in ICML 2024. ☆105 Updated last year
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM ☆60 Updated last year
- ☆89 Updated last year
- EvaByte: Efficient Byte-level Language Models at Scale ☆111 Updated 7 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆37 Updated 2 months ago
- ☆26 Updated 11 months ago
- PyTorch implementation of models from the Zamba2 series. ☆186 Updated 10 months ago
- ☆136 Updated last year
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆35 Updated 9 months ago
- ☆55 Updated last year
- [ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications ☆51 Updated last month
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆103 Updated 6 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 Updated last year
- An automated data pipeline scaling RL to pretraining levels ☆72 Updated 2 months ago
- Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆54 Updated last month
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (EMNLP'24) ☆148 Updated last year
- ☆63 Updated last year
- Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton. ☆148 Updated last month
- Collection of autoregressive model implementations ☆85 Updated 7 months ago
- The official repo for “Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem” [EMNLP25] ☆33 Updated 3 months ago
- Official implementation of ECCV24 paper: POA ☆24 Updated last year
- ☆81 Updated last year
- Unofficial Implementation of Evolutionary Model Merging ☆41 Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 Updated last month
- PyTorch implementation of the PEER block from the paper "Mixture of A Million Experts" by Xu Owen He at DeepMind ☆132 Updated last month
- Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More ☆33 Updated 6 months ago