GCYZSL / MoLA
☆132 · Updated 9 months ago
Alternatives and similar repositories for MoLA:
Users interested in MoLA are comparing it to the repositories listed below.
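Many of the entries below are variants of mixture-of-LoRA-experts fine-tuning (MoLA, MOELoRA, LoRAMoE, MELoRA, MixLoRA-style methods). As a rough orientation only, here is a minimal sketch of the shared idea: each frozen linear layer gains several low-rank adapters whose outputs are combined by a learned gate. The class name `MoLoRALinear`, its hyperparameters, and the token-wise softmax router are illustrative assumptions, not the API of MoLA or of any repository listed here.

```python
import torch
import torch.nn as nn

class MoLoRALinear(nn.Module):
    """Sketch of a mixture-of-LoRA-experts layer (the idea shared by
    MoLA / MOELoRA / LoRAMoE); names and defaults are assumptions."""

    def __init__(self, base: nn.Linear, num_experts: int = 4,
                 rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # pretrained weights stay frozen
            p.requires_grad_(False)
        in_f, out_f = base.in_features, base.out_features
        self.scaling = alpha / rank
        # One low-rank (A, B) pair per expert; B starts at zero so the
        # adapted layer initially reproduces the pretrained output.
        self.A = nn.Parameter(torch.randn(num_experts, rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_experts, out_f, rank))
        self.router = nn.Linear(in_f, num_experts)  # token-wise gate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gates = torch.softmax(self.router(x), dim=-1)        # (..., E)
        low = torch.einsum("...d,erd->...er", x, self.A)     # project down
        upd = torch.einsum("...er,eor->...eo", low, self.B)  # project up
        mix = torch.einsum("...e,...eo->...o", gates, upd)   # gate-weighted sum
        return self.base(x) + self.scaling * mix
```

The projects differ mainly in where the gate lives and how experts are allocated; MoLA's specific contribution is assigning different numbers of experts per transformer layer (higher layers receive more), which the per-layer sketch above does not show.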
- [SIGIR'24] The official implementation code of MOELoRA. ☆160 · Updated 9 months ago
- ☆172 · Updated 9 months ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆128 · Updated last month
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆117 · Updated 5 months ago
- LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment ☆321 · Updated 11 months ago
- Code for ACL 2024 accepted paper titled "SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language … ☆34 · Updated 3 months ago
- State-of-the-art Parameter-Efficient MoE Fine-tuning Method ☆156 · Updated 8 months ago
- Model merging is a highly efficient approach for long-to-short reasoning. ☆42 · Updated 3 weeks ago
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379) ☆35 · Updated last year
- Repository for "Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning" ☆162 · Updated last year
- ☆99 · Updated 9 months ago
- ☆192 · Updated 6 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆187 · Updated last week
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆65 · Updated 2 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆154 · Updated 10 months ago
- Code for the ACL 2024 paper "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning" ☆19 · Updated 2 months ago
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight) ☆127 · Updated 2 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆52 · Updated 4 months ago
- ☆185 · Updated 2 months ago
- ☆51 · Updated last week
- Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized?" ☆112 · Updated 2 weeks ago
- An Efficient LLM Fine-Tuning Factory Optimized for MoE PEFT ☆90 · Updated last month
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆136 · Updated 2 months ago
- ☆255 · Updated last year
- ☆93 · Updated last month
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆168 · Updated 10 months ago
- ☆81 · Updated last year
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆108 · Updated last year
- ☆29 · Updated 4 months ago
- This repository collects awesome surveys, resources, and papers on Lifelong Learning for Large Language Models. (Updated regularly) ☆46 · Updated 2 months ago