Community Implementation of the paper: "Multi-Head Mixture-of-Experts" In PyTorch
☆31Jun 22, 2026Updated last week
Alternatives and similar repositories for MHMoE
Users that are interested in MHMoE are comparing it to the libraries listed below.
- Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Ze…☆132Jun 22, 2026Updated last week
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi…☆16Jun 22, 2026Updated last week
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆31Jun 22, 2026Updated last week
- Implementation of xLSTM in Pytorch from the paper: "xLSTM: Extended Long Short-Term Memory"☆118Jun 22, 2026Updated last week
- Train a production-grade GPT in less than 400 lines of code. Better than Karpathy's version and GIGAGPT☆17Jun 22, 2026Updated last week
- Paper dataset for "Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers"☆13Oct 20, 2024Updated last year
- Conformer RNN-Transducer☆14May 25, 2022Updated 4 years ago
- An implementation of the base GPT-3 Model architecture from the paper by OPENAI "Language Models are Few-Shot Learners"☆22Jun 29, 2024Updated 2 years ago
- A curated collection of prompts for Grok Imagine by xAI☆32Jun 6, 2026Updated 3 weeks ago
- ☆29Oct 9, 2024Updated last year
- [MICCAI2023] XSurv: Merging-Diverging Hybrid Transformer Networks for Survival Prediction☆12Oct 2, 2023Updated 2 years ago
- [NeurIPS D&B'24]Enhancing vision-language models for medical imaging: bridging the 3D gap with innovative slice selection☆24Mar 25, 2026Updated 3 months ago
- Community Open Source Implementation of GPT4o in PyTorch☆32Jun 22, 2026Updated last week
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"☆28Updated this week
- Simple Implementation of a Transformer in the new framework MLX by Apple☆19Nov 18, 2024Updated last year
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- DAR introduces the diagonal scanning order for next-token prediction and proposes a direction-aware autoregressive transformer framework.☆19Apr 16, 2025Updated last year
- Per function, Lua JIT using LLVM C++ toolchain☆10Jun 8, 2017Updated 9 years ago
- Train toy models using multi-token prediction objective☆14Apr 18, 2026Updated 2 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276☆28May 4, 2025Updated last year
- FreeSWITCH ASR module fork from mod_audio_stream, use FunASR online cpu version☆18Jun 27, 2025Updated last year
- Implementation of Liquid Nets in Pytorch☆71Jun 22, 2026Updated last week
- a WIP architecture designed to allow transformers to think in a manner without tokens☆20Apr 12, 2024Updated 2 years ago
- ☆12Dec 14, 2024Updated last year
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆17May 16, 2025Updated last year
- A simple API version of funasr speech-to-text (funasr + fastapi), easy to deploy on a server☆13Aug 10, 2024Updated last year
- Multi-modal approach for tumor segmentation and survival prediction using PET/CT imaging with attention mechanisms (MICCAI2021 HECKTOR Ch…☆12Apr 22, 2022Updated 4 years ago
- ☆11Dec 24, 2024Updated last year
- Wind visualization over time☆102Oct 23, 2025Updated 8 months ago
- ☆11Sep 18, 2023Updated 2 years ago
- RWKV6 in native pytorch and triton:)☆11Aug 4, 2024Updated last year
- ☆10Feb 21, 2023Updated 3 years ago
- ☆14Aug 9, 2021Updated 4 years ago
- Short-utterance online speech recognition service based on wenet☆11Feb 25, 2023Updated 3 years ago
- ☆15Sep 23, 2022Updated 3 years ago
- ASR_LLM_TTS front-end project☆15Dec 3, 2024Updated last year
- ☆14Jan 22, 2025Updated last year
- A simple WebAssembly Linker in JavaScript☆17Jun 15, 2021Updated 5 years ago
- FunASR Android on-device offline version, 2pass full mode☆15Sep 4, 2023Updated 2 years ago