☆19 · Nov 5, 2024 · Updated last year
Alternatives and similar repositories for MH-MoE
Users who are interested in MH-MoE are comparing it to the libraries listed below.
- [NAACL'25 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert… ☆16 · Feb 4, 2025 · Updated last year
- ☆18 · Mar 2, 2026 · Updated 2 months ago
- ☆29 · Oct 9, 2024 · Updated last year
- ☆68 · Dec 2, 2024 · Updated last year
- ☆13 · Feb 17, 2025 · Updated last year
- [ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM. ☆113 · Dec 20, 2024 · Updated last year
- Mixture of Lora Experts ☆10 · Apr 7, 2024 · Updated 2 years ago
- ☆17 · Nov 25, 2024 · Updated last year
- ☆14 · May 31, 2023 · Updated 2 years ago
- A new release of Chinese sexism dataset and lexicon ☆15 · May 23, 2023 · Updated 2 years ago
- ControlLM is a method to control the personality traits and behaviors of language models in real-time at inference without costly trainin… ☆20 · Nov 6, 2024 · Updated last year
- SCCD: a session-based Chinese cyberbullying detection dataset ☆22 · Mar 9, 2025 · Updated last year
- ☆17 · Oct 28, 2025 · Updated 6 months ago
- Official PyTorch Implementation of Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos ☆11 · Apr 26, 2026 · Updated last week
- The code of SKS ☆15 · Mar 22, 2022 · Updated 4 years ago
- ☆14 · Mar 6, 2022 · Updated 4 years ago
- Public version for DistPepFold ☆10 · Jul 17, 2025 · Updated 9 months ago
- Code and data for the paper: On the Reliability of Psychological Scales on Large Language Models ☆30 · Dec 15, 2025 · Updated 4 months ago
- This is AI implementation (not official) of the DreamGym framework from the paper "Scaling Agent Learning via Experience Synthesis" (arXi… ☆40 · Nov 9, 2025 · Updated 6 months ago
- AdaMoLE: Adaptive Mixture of LoRA Experts ☆38 · Oct 11, 2024 · Updated last year
- ☆28 · Dec 5, 2025 · Updated 5 months ago
- ☆13 · Aug 20, 2021 · Updated 4 years ago
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts" ☆70 · Aug 22, 2023 · Updated 2 years ago
- ☆11 · Jun 4, 2021 · Updated 4 years ago
- Official Implementation of "Simulating Environments with Reasoning Models for Agent Training" ☆65 · Feb 18, 2026 · Updated 2 months ago
- The repository implements the paper "Learning Graph Quantized Tokenizers for Transformers". ☆31 · Apr 2, 2025 · Updated last year
- Mixture of Attention Heads ☆52 · Oct 10, 2022 · Updated 3 years ago
- Convolutional variational autoencoders and text-question, emoji-answer models ☆11 · Jun 19, 2017 · Updated 8 years ago
- A toolset and pipeline for running zero shot and supervised protein fitness prediction, drop in compatible with scikitlearn ☆13 · Nov 28, 2025 · Updated 5 months ago
- ☆16 · Mar 1, 2025 · Updated last year
- ☆43 · Mar 27, 2026 · Updated last month
- Converts AlphaFold distograms into distance matrices and saves them into a number of formats ☆15 · Dec 13, 2022 · Updated 3 years ago
- This repo is for the paper: On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark ☆24 · Aug 13, 2022 · Updated 3 years ago
- ☆27 · Nov 20, 2023 · Updated 2 years ago
- A Winograd Minimal Filter Implementation in CUDA ☆29 · Aug 25, 2021 · Updated 4 years ago
- TUnA: Transformer-based Uncertainty Aware model for PPI Prediction ☆17 · Dec 21, 2025 · Updated 4 months ago
- ProtTrans is providing state of the art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit a… ☆11 · Jun 2, 2022 · Updated 3 years ago
- Efficiently apply modification functions to RLDS/TFDS datasets. ☆32 · Jun 19, 2024 · Updated last year
- A template for running Stable Diffusion 3 with Cog ☆14 · Aug 20, 2024 · Updated last year