UNITES-Lab / Lingual-SMoE
[ICLR 2024] Code for the paper "Sparse MoE with Language-Guided Routing for Multilingual Machine Translation"
☆9Updated last year
Alternatives and similar repositories for Lingual-SMoE
Users that are interested in Lingual-SMoE are comparing it to the libraries listed below
- ☆29Updated last year
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆38Updated 7 months ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning"☆97Updated last year
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging☆59Updated 3 months ago
- ☆25Updated last year
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations"☆61Updated last year
- [NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning☆203Updated 6 months ago
- AdaMoLE: Adaptive Mixture of LoRA Experts☆31Updated 7 months ago
- Source code of EMNLP 2022 Findings paper "SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters"☆18Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆74Updated 6 months ago
- ☆138Updated 10 months ago
- Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)☆21Updated 7 months ago
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks…☆38Updated 6 months ago
- ☆28Updated 4 months ago
- Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"☆24Updated 2 weeks ago
- [EMNLP 2023, Main Conference] Sparse Low-rank Adaptation of Pre-trained Language Models☆76Updated last year
- code for ACL24 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning"☆19Updated 3 months ago
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379)☆38Updated last year
- ✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio☆46Updated 3 weeks ago
- PyTorch implementation of StableMask (ICML'24)☆13Updated 11 months ago
- Code for Reducing Hallucinations in Vision-Language Models via Latent Space Steering☆57Updated 6 months ago
- Code for paper: Aligning Large Language Models with Representation Editing: A Control Perspective☆32Updated 4 months ago
- 🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training☆86Updated 6 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…☆38Updated 11 months ago
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of…☆21Updated last week
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts" (EMNLP 2023)☆38Updated last year
- Multimodal Instruction Tuning with Conditional Mixture of LoRA (ACL 2024)☆20Updated 9 months ago
- MMoE: Multimodal Mixture-of-Experts (EMNLP 2024)☆11Updated 6 months ago
- Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation [NAACL 2024]☆97Updated last year
- Codes for Merging Large Language Models☆31Updated 9 months ago