☆23Dec 11, 2024Updated last year
Alternatives and similar repositories for MoE-LPR
Users that are interested in MoE-LPR are comparing it to the libraries listed below.
- ☆12Feb 16, 2024Updated 2 years ago
- ☆46Sep 27, 2025Updated 6 months ago
- Implementing DP/TP/PP with torch.distributed☆13Dec 28, 2023Updated 2 years ago
- ☆17Dec 21, 2023Updated 2 years ago
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"☆20Apr 9, 2025Updated last year
- A user-friendly & efficient knowledge distillation framework for LLMs, supporting off-policy, on-policy (OPD), cross-tokenizer, multimoda…☆83Updated this week
- An unofficial PyTorch implementation of Mix-Phoneme-Bert☆40Jul 10, 2023Updated 2 years ago
- ☆29Mar 13, 2026Updated last month
- TBD☆53Mar 13, 2026Updated last month
- Learnable Global Pooling Layers Based on Regularized Optimal Transport (ROT)☆16Mar 17, 2024Updated 2 years ago
- ☆19May 2, 2024Updated last year
- [ICML 2025 Oral] Mixture of Lookup Experts☆72Dec 3, 2025Updated 4 months ago
- Source code for Noise-Contrastive Estimation for Multivariate Point Processes (NeurIPS 2020).☆15Nov 3, 2020Updated 5 years ago
- An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset☆28Jan 19, 2025Updated last year
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆56Feb 28, 2023Updated 3 years ago
- Official code for Dense-TSNet☆12Sep 17, 2024Updated last year
- A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers.☆66Jul 6, 2025Updated 9 months ago
- ☆57Dec 27, 2025Updated 3 months ago
- VI-SVC model is just VITS without MAS and DurationPredictor.☆10Nov 9, 2023Updated 2 years ago
- semantic tokenizer for speech and music☆21Jul 6, 2025Updated 9 months ago
- Official PyTorch implementation of CD-MOE☆12Mar 18, 2026Updated 3 weeks ago
- singing voice conversion based on glow-tts☆12Aug 20, 2023Updated 2 years ago
- 4G GPU & 10 Minutes for train☆12Aug 9, 2023Updated 2 years ago
- The official code for our EMNLP 2022 long paper [Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation…☆26Sep 10, 2025Updated 7 months ago
- an easy-to-use knn-mt toolkit☆104Aug 19, 2023Updated 2 years ago
- Multilingual Large Language Models Evaluation Benchmark☆132Aug 21, 2024Updated last year
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]☆149Oct 27, 2024Updated last year
- This is a project of Interspeech2021 paper "SpecMix : A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Fea…☆11Sep 27, 2022Updated 3 years ago
- Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge☆116Apr 1, 2026Updated last week
- ☆163Feb 15, 2025Updated last year
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆143Apr 7, 2026Updated last week
- [WIP] Direction-based Multi-Channel Speech Separation☆14Jan 25, 2024Updated 2 years ago
- Reference implementation of models from Nyonic Model Factory☆12May 13, 2024Updated last year
- An unofficial non-causal TensorFlow implementation of "Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Spee…☆14Dec 27, 2022Updated 3 years ago
- The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24)☆70Jul 3, 2025Updated 9 months ago
- An unofficial code reproduction of Channel Attention Dense U-Net for Multichannel Speech Enhancement☆13Jul 17, 2023Updated 2 years ago
- ☆14Jul 23, 2024Updated last year
- The source code for the paper CrossSinger (asru2023)☆18Oct 12, 2023Updated 2 years ago
- [ICML2025] The official implementation of "WGFormer: An SE(3)-Transformer Driven by Wasserstein Gradient Flows for Molecular Ground-State…☆33Mar 18, 2026Updated 3 weeks ago