Community implementation of the paper "Multi-Head Mixture-of-Experts" in PyTorch
☆31May 11, 2026Updated last month
Alternatives and similar repositories for MHMoE
Users that are interested in MHMoE are comparing it to the libraries listed below.
- Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta☆15Nov 11, 2024Updated last year
- Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Ze…☆130May 12, 2026Updated last month
- Simple Implementation of TinyGPTV in super simple Zeta lego blocks☆16Nov 11, 2024Updated last year
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi…☆16May 25, 2026Updated 2 weeks ago
- An open source replication of the Strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆31May 19, 2026Updated 3 weeks ago
- Implementation of xLSTM in Pytorch from the paper: "xLSTM: Extended Long Short-Term Memory"☆118May 11, 2026Updated last month
- Paper dataset for "Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers"☆13Oct 20, 2024Updated last year
- Conformer RNN-Transducer☆14May 25, 2022Updated 4 years ago
- a simplified version of Google's Gemma model to be used for learning☆26Mar 2, 2024Updated 2 years ago
- Implementation of the Pairformer model used in AlphaFold 3☆14May 25, 2026Updated 2 weeks ago
- An implementation of the base GPT-3 model architecture from the OpenAI paper "Language Models are Few-Shot Learners"☆21Jun 29, 2024Updated last year
- Implementation of RNN Transducer☆23Jun 25, 2019Updated 6 years ago
- A curated collection of prompts for Grok Imagine by xAI☆30Jun 6, 2026Updated last week
- Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)☆19May 28, 2024Updated 2 years ago
- An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆36Jun 7, 2024Updated 2 years ago
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆11Sep 30, 2024Updated last year
- Community Open Source Implementation of GPT4o in PyTorch☆32Jun 6, 2026Updated last week
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"☆28Updated this week
- Simple Implementation of a Transformer in the new framework MLX by Apple☆19Nov 18, 2024Updated last year
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- Per function, Lua JIT using LLVM C++ toolchain☆10Jun 8, 2017Updated 9 years ago
- Train toy models using multi-token prediction objective☆14Apr 18, 2026Updated last month
- https://x.com/BlinkDL_AI/status/1884768989743882276☆28May 4, 2025Updated last year
- Compute WER and SER for speech recognition evaluation☆26Jun 6, 2026Updated last week
- Implementation of Liquid Nets in Pytorch☆71May 12, 2026Updated last month
- a WIP architecture designed to allow transformers to think in a manner without tokens☆20Apr 12, 2024Updated 2 years ago
- ☆12Dec 14, 2024Updated last year
- This repository contains the code for UNETR: Transformers for 3D Medical Image Segmentation [1]. UNETR is the first 3D segmentation netwo…☆16Jul 8, 2022Updated 3 years ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆17May 16, 2025Updated last year
- Hpyformer based on FunASR☆31Nov 5, 2024Updated last year
- A simple and concise templating engine that takes advantage of elegant Lua syntax.☆11Nov 25, 2023Updated 2 years ago
- ☆11Dec 24, 2024Updated last year
- Focus handling and navigation library with React integration. This is a read-only mirror.☆15Dec 19, 2024Updated last year
- Implementation of the premier Text to Video model from OpenAI☆57Nov 11, 2024Updated last year
- This is a speaker-diarization-based speech recognition API implemented using FunASR.☆26Feb 12, 2026Updated 4 months ago
- ☆12Jul 11, 2024Updated last year
- Exploration into the Firefly algorithm in Pytorch☆41Feb 14, 2025Updated last year
- ☆10Feb 21, 2023Updated 3 years ago
- ☆14Aug 9, 2021Updated 4 years ago