MNoorFawi / curlora
The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation.
☆41 · Updated 5 months ago
Alternatives and similar repositories for curlora:
Users interested in curlora are comparing it to the libraries listed below:
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆35 · Updated 9 months ago
- A repository for research on medium-sized language models. ☆76 · Updated 8 months ago
- The first dense retrieval model that can be prompted like an LM ☆64 · Updated 4 months ago
- RWKV-7: Surpassing GPT ☆76 · Updated 2 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆50 · Updated 10 months ago
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models" ☆35 · Updated last year
- ☆31 · Updated 4 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆54 · Updated 5 months ago
- This is the official repository for Inheritune. ☆109 · Updated this week
- ☆62 · Updated 4 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆39 · Updated 8 months ago
- Official implementation of ECCV24 paper: POA ☆24 · Updated 6 months ago
- Implementation of https://arxiv.org/pdf/2312.09299 ☆20 · Updated 7 months ago
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep… ☆53 · Updated 3 months ago
- ☆44 · Updated 6 months ago
- Token Omission Via Attention ☆122 · Updated 4 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 · Updated 11 months ago
- ☆49 · Updated 11 months ago
- Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding" ☆46 · Updated 2 months ago
- PyTorch implementation of models from the Zamba2 series. ☆176 · Updated 3 weeks ago
- Set of scripts to finetune LLMs ☆36 · Updated 10 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆42 · Updated 6 months ago
- ☆71 · Updated 5 months ago
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆27 · Updated this week
- ☆48 · Updated 3 months ago
- Train, tune, and infer the Bamba model ☆83 · Updated 3 weeks ago
- ☆26 · Updated last month