UNITES-Lab / Lingual-SMoE
[ICLR 2024] Code for the paper "Sparse MoE with Language-Guided Routing for Multilingual Machine Translation"
☆8Updated 4 months ago
Related projects: ⓘ
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations"☆54Updated 9 months ago
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks…☆31Updated 9 months ago
- Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation [NAACL 2024]☆86Updated last year
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379)☆28Updated 5 months ago
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024☆20Updated 2 months ago
- Source code of EMNLP 2022 Findings paper "SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters"☆18Updated 5 months ago
- ICLR 2024 OpenReivew Submission Data☆131Updated 10 months ago
- ☆15Updated 3 months ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning"☆90Updated 5 months ago
- [IJCAI'23] The official Github page of the paper "Diffusion Models for Non-autoregressive Text Generation: A Survey".☆47Updated 3 months ago
- ☆98Updated 6 months ago
- The official codebase for "Empowering Diffusion Models on the Embedding Space for Text Generation" (NAACL 2024)☆49Updated 4 months ago
- ☆110Updated last month
- ☆24Updated 11 months ago
- ☆119Updated last week
- One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning☆36Updated last year
- ☆25Updated last year
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"☆63Updated 3 months ago
- A paper list about diffusion models for natural language processing.☆170Updated last year
- Source code of LatentOps☆77Updated 10 months ago
- The source code of the EMNLP 2023 main conference paper: Sparse Low-rank Adaptation of Pre-trained Language Models.☆62Updated 6 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024)☆84Updated 5 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs☆61Updated last year
- Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"☆65Updated 6 months ago
- On the Effectiveness of Parameter-Efficient Fine-Tuning☆38Updated 10 months ago
- ☆32Updated last year
- Code for ACL 2024 accepted paper titled "SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language …☆19Updated last month
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"☆59Updated 7 months ago
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc..☆109Updated 2 weeks ago
- ACL'2023: DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models☆286Updated 7 months ago