Bumble666 / Hyper_MoE
☆15 Updated 3 months ago
Related projects:
- ☆22 Updated last month
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379) ☆28 Updated 5 months ago
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024 ☆20 Updated 2 months ago
- One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning ☆36 Updated last year
- Codebase for ACL 2023 paper "Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models' Memori… ☆44 Updated 11 months ago
- Code for ACL 2024 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning" ☆12 Updated 4 months ago
- ☆25 Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆28 Updated 8 months ago
- [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling ☆78 Updated last year
- [AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA ☆21 Updated 5 months ago
- Code for paper "UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning", ACL 2022 ☆58 Updated 2 years ago
- ☆11 Updated 2 months ago
- The source code of the EMNLP 2023 main conference paper: Sparse Low-rank Adaptation of Pre-trained Language Models. ☆62 Updated 6 months ago
- ☆110 Updated last month
- [ICML'2024] Can AI Assistants Know What They Don't Know? ☆62 Updated 7 months ago
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations" ☆54 Updated 9 months ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning" ☆90 Updated 5 months ago
- ☆42 Updated 5 months ago
- ☆38 Updated last week
- A benchmark for evaluating logical reasoning of LLMs ☆16 Updated 7 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆61 Updated last year
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)" ☆31 Updated 5 months ago
- ☆31 Updated 3 months ago
- [ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691) ☆111 Updated 6 months ago
- ☆18 Updated 3 months ago
- Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆65 Updated 6 months ago
- Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation [NAACL 2024] ☆86 Updated last year
- Code for Merging Large Language Models ☆16 Updated last month
- PyTorch implementation of StableMask (ICML'24) ☆11 Updated 2 months ago
- Official Code for the paper "SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs" (ICLR 2024) ☆16 Updated 4 months ago