Community implementation of the paper "Multi-Head Mixture-of-Experts" in PyTorch
☆29 · Mar 22, 2026 · Updated this week
Alternatives and similar repositories for MHMoE
Users that are interested in MHMoE are comparing it to the libraries listed below.
- Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta ☆16 · Nov 11, 2024 · Updated last year
- Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Ze… ☆124 · Updated this week
- Simple implementation of TinyGPTV in super simple Zeta lego blocks ☆16 · Nov 11, 2024 · Updated last year
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆29 · Mar 16, 2026 · Updated last week
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆15 · Mar 16, 2026 · Updated last week
- Implementation of xLSTM in PyTorch from the paper: "xLSTM: Extended Long Short-Term Memory" ☆118 · Feb 6, 2026 · Updated last month
- Paper dataset for "Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers" ☆13 · Oct 20, 2024 · Updated last year
- Deft: A Scalable Tree Index for Disaggregated Memory ☆23 · Apr 23, 2025 · Updated 11 months ago
- A curated collection of prompts for Grok Imagine by xAI ☆25 · Oct 19, 2025 · Updated 5 months ago
- Implementation of the Pairformer model used in AlphaFold 3 ☆14 · Mar 16, 2026 · Updated last week
- An implementation of the base GPT-3 model architecture from the OpenAI paper "Language Models are Few-Shot Learners" ☆20 · Jun 29, 2024 · Updated last year
- This is a read-only mirror of the gem5 simulator. The upstream repository is stored in https://gem5.googlesource.com, code reviews shoul… ☆13 · May 15, 2020 · Updated 5 years ago
- [NeurIPS D&B'24] Enhancing vision-language models for medical imaging: bridging the 3D gap with innovative slice selection ☆21 · Nov 25, 2024 · Updated last year
- Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024) ☆19 · May 28, 2024 · Updated last year
- An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆36 · Jun 7, 2024 · Updated last year
- Data manipulation and transformation for audio signal processing, powered by PyTorch ☆10 · Sep 30, 2024 · Updated last year
- Community open source implementation of GPT4o in PyTorch ☆26 · Updated this week
- PyTorch implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆27 · Mar 9, 2026 · Updated 2 weeks ago
- Simple implementation of a Transformer in the new framework MLX by Apple ☆19 · Nov 18, 2024 · Updated last year
- DAR introduces the diagonal scanning order for next-token prediction and proposes a direction-aware autoregressive transformer framework. ☆18 · Apr 16, 2025 · Updated 11 months ago
- Tool to generate documentation for Nelua source files. ☆10 · Dec 11, 2021 · Updated 4 years ago
- Train toy models using multi-token prediction objective ☆14 · May 8, 2024 · Updated last year
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 · May 4, 2025 · Updated 10 months ago
- Implementation of Liquid Nets in PyTorch ☆70 · Jan 31, 2026 · Updated last month
- A simplified version of Meta's Llama 3 model to be used for learning ☆44 · May 21, 2024 · Updated last year
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆16 · May 16, 2025 · Updated 10 months ago
- Hpyformer base FunASR ☆30 · Nov 5, 2024 · Updated last year
- Simple API for CosyVoice speech synthesis ☆14 · Nov 1, 2024 · Updated last year
- This is the implementation repository of our SOSP'24 paper: CHIME: A Cache-Efficient and High-Performance Hybrid Index on Disaggregated M… ☆28 · Nov 7, 2024 · Updated last year
- Focus handling and navigation library with React integration. This is a read-only mirror. ☆15 · Dec 19, 2024 · Updated last year
- Implementation of the premier Text to Video model from OpenAI ☆56 · Nov 11, 2024 · Updated last year
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀 ☆20 · May 20, 2025 · Updated 10 months ago
- RWKV6 in native PyTorch and Triton :) ☆11 · Aug 4, 2024 · Updated last year