[ICML 2024] Code for the paper "MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts"
☆10Jul 1, 2024Updated last year
Alternatives and similar repositories for MoE-RBench
Users that are interested in MoE-RBench are comparing it to the libraries listed below.
- ☆21May 2, 2025Updated 10 months ago
- [ECCV 2024] Code for the paper "Mew: Multiplexed Immunofluorescence Image Analysis through an Efficient Multiplex Network"☆17Jul 27, 2024Updated last year
- 🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts☆41Sep 29, 2024Updated last year
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…☆16Feb 4, 2025Updated last year
- ☆15Oct 19, 2024Updated last year
- Open-Pandora: On-the-fly Control Video Generation☆35Nov 28, 2024Updated last year
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"☆30Jun 30, 2025Updated 8 months ago
- code for Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning☆20Jul 16, 2024Updated last year
- The official implementation of dLLM-Factory☆26Jul 12, 2025Updated 8 months ago
- Official Repository of "Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Ste…☆27Mar 9, 2026Updated 2 weeks ago
- Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)☆25Oct 23, 2024Updated last year
- Source code for "SimCKP: Simple Contrastive Learning of Keyphrase Representations", Findings of EMNLP 2023☆12Jun 20, 2025Updated 9 months ago
- Transformers components but in Triton☆34May 9, 2025Updated 10 months ago
- Code for the ICLR'24 paper: MT-RANKER: Reference-free machine translation evaluation by inter-system ranking☆10Feb 29, 2024Updated 2 years ago
- ☆11Jun 28, 2024Updated last year
- [ICML 2023] Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optim…☆10Dec 19, 2023Updated 2 years ago
- Answering Ambiguous Questions via Iterative Prompting☆14May 25, 2024Updated last year
- ☆23Feb 3, 2026Updated last month
- ✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints☆79Jul 10, 2025Updated 8 months ago
- The official source code for Self-Guided Robust Graph Structure Refinement (SG-GSR) at WWW 2024 Research Track.☆17Apr 23, 2024Updated last year
- ☆30Sep 28, 2023Updated 2 years ago
- ☆14Oct 6, 2025Updated 5 months ago
- ☆18Nov 10, 2024Updated last year
- ☆12Dec 4, 2023Updated 2 years ago
- The official source code for "Subgraph Federated Learning for Local Generalization (FedLoG)" at ICLR 2025 (Oral).☆15May 6, 2025Updated 10 months ago
- ☆21Jun 4, 2024Updated last year
- The official code of "PixelWorld: Towards Perceiving Everything as Pixels" [TMLR25]☆16Sep 12, 2025Updated 6 months ago
- ☆17Feb 4, 2025Updated last year
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)☆32Jul 3, 2024Updated last year
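- A collection of selected graduate coursework assignments from Shanghai Jiao Tong University, Spring 2020☆16Jun 14, 2020Updated 5 years ago
- A vocoder based on PC-DDSP and nsf-HiFiGAN☆18Jul 17, 2023Updated 2 years ago
- Study notes compiled by a current UESTC undergraduate (2020 cohort), in the hope of helping junior students ❤☆14Sep 18, 2023Updated 2 years ago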
- ☆11Sep 1, 2024Updated last year
- Code for NeurIPS 2022 Spotlight paper "Non-Monotonic Latent Alignments for CTC-Based Non-Autoregressive Machine Translation"☆20Nov 16, 2022Updated 3 years ago
- Code for "Domain Adaptive Meta-learning for Dialogue State Tracking" (TASLP 2021)☆10Sep 14, 2021Updated 4 years ago
- Elixir: Train a Large Language Model on a Small GPU Cluster☆15Jun 8, 2023Updated 2 years ago
- The official source code for "Single-cell RNA-seq data imputation using Feature Propagation", accepted at 2023 ICML Workshop on Computati…☆12Aug 31, 2023Updated 2 years ago
- Code for "Exploiting reverse target-side contexts for neural machine translation via asynchronous bidirectional decoding" (Artificial Int…☆11Dec 27, 2022Updated 3 years ago
- Source code of ACL 2023 Main Conference Paper "PAD-Net: An Efficient Framework for Dynamic Networks".☆11Feb 28, 2026Updated 3 weeks ago