zms1999/SmartMoE

zms1999 / SmartMoE

A MoE impl for PyTorch, [ATC'23] SmartMoE

☆73

Alternatives and similar repositories for SmartMoE

Users who are interested in SmartMoE are comparing it to the libraries listed below.

Zoeyyao27 / SirLLM
This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM
☆60 · May 28, 2024 · Updated 2 years ago
ServiceNow / promptmix-emnlp-2023
Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023
☆12 · Dec 13, 2023 · Updated 2 years ago
MayDomine / Burst-Attention
Distributed IO-aware Attention algorithm
☆24 · Sep 24, 2025 · Updated 10 months ago
SoulOfScience / Guo-s_MaiRuo_Collection
A collection of classic humblebrag ("mai ruo") quotes from top CS overachievers
☆10 · May 16, 2021 · Updated 5 years ago
thu-pacman / FasterMoE
☆92 · Apr 2, 2022 · Updated 4 years ago
YJHMITWEB / ExFlow
Explore Inter-layer Expert Affinity in MoE Model Inference
☆16 · May 6, 2024 · Updated 2 years ago
d-matrix-ai / keyformer-llm
Keyformer proposes KV cache reduction through key-token identification, without the need for fine-tuning
☆57 · Mar 26, 2024 · Updated 2 years ago
thu-pacman / lab-guide
Everything about PACMAN!
☆19 · May 28, 2026 · Updated last month
RahulSChand / Weighted-low-rank-factorization-Pytorch
PyTorch implementation of Language model compression with weighted low-rank factorization
☆14 · Jun 28, 2023 · Updated 3 years ago
bytedance / ByteTransformer
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
☆479 · Mar 15, 2024 · Updated 2 years ago
GaiYu0 / QDGAT
Question-Directed Graph Attention Network for Numerical Reasoning over Text
☆10 · Aug 14, 2020 · Updated 5 years ago
Cohere-Labs-Community / parameter-efficient-moe
☆278 · Oct 31, 2023 · Updated 2 years ago
MayDomine / Seq1F1B
Sequence-level 1F1B schedule for LLMs.
☆19 · Jun 4, 2024 · Updated 2 years ago
RUCAIBox / QuantizedEmpirical
☆15 · Sep 24, 2023 · Updated 2 years ago
hanruiqian / Awesome-Federated-LLM-Related-Works
☆17 · Oct 12, 2023 · Updated 2 years ago
zepingyu0512 / arithmetic-mechanism
Code for the EMNLP 2024 paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
☆12 · Nov 17, 2024 · Updated last year
Linzwcs / AFT
☆13 · Jan 22, 2025 · Updated last year
spcl / fmi
Function Message Interface (FMI): library for message-passing and collective communication for serverless functions.
☆22 · Apr 16, 2024 · Updated 2 years ago
Nathangitlab / Backdoor-Attacks-on-Crowd-Counting
Code for the ACM MM paper "Backdoor Attack on Crowd Counting"
☆17 · Jul 10, 2022 · Updated 4 years ago
pjlab-sys4nlp / llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
☆1,004 · Dec 6, 2024 · Updated last year
hdong920 / GRIFFIN
☆40 · Aug 27, 2024 · Updated last year
GATECH-EIC / PipeGCN
[ICLR 2022] "PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication" by Cheng Wan, Y…
☆34 · Mar 15, 2023 · Updated 3 years ago
locuslab / scaling_laws_data_filtering
☆64 · Apr 9, 2024 · Updated 2 years ago
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17 · Feb 22, 2024 · Updated 2 years ago
pkusys / Rummy
GPU-accelerated vector query processing system that supports large vector datasets beyond GPU memory.
☆41 · Mar 24, 2024 · Updated 2 years ago
RUCBM / DelTA
Code for Paper 'DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards'
☆17 · May 21, 2026 · Updated 2 months ago
XueFuzhao / OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
☆1,691 · Mar 8, 2024 · Updated 2 years ago
ds2-lab / infinistore
InfiniStore: an elastic serverless cloud storage system (VLDB'23)
☆25 · May 5, 2023 · Updated 3 years ago
HKBU-HPML / MG-WFBP
MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning
☆12 · Apr 26, 2021 · Updated 5 years ago
flexflow / flexflow-train
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
☆1,897 · Jul 1, 2026 · Updated 3 weeks ago
ielab / llm-qlm
Open-source Large Language Models are Strong Zero-shot Query Likelihood Models for Document Ranking
☆17 · Oct 26, 2023 · Updated 2 years ago
ejnnr / cupbearer
A library for mechanistic anomaly detection
☆22 · Jan 9, 2025 · Updated last year
princeton-nlp / ELIZA-Transformer
[NAACL 2025] Representing Rule-based Chatbots with Transformers
☆23 · Feb 9, 2025 · Updated last year
FasterDecoding / SnapKV
☆325 · Jul 10, 2025 · Updated last year
SalesforceAIResearch / GemFilter
☆84 · Jun 2, 2026 · Updated last month
ant-research / M2-Miner
[ICLR 2026] M2-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining
☆55 · Apr 22, 2026 · Updated 3 months ago
jeongminpark417 / GIDS
☆43 · Jun 13, 2025 · Updated last year
hku-systems / naspipe
☆14 · Jan 12, 2022 · Updated 4 years ago
FittenTech / OpenLLaMA-Chinese
OpenLLaMA-Chinese, a permissively licensed open-source instruction-following model based on OpenLLaMA
☆65 · Jun 29, 2023 · Updated 3 years ago