xpan413/FSMoE

☆16

Alternatives and similar repositories for FSMoE

Users who are interested in FSMoE are comparing it to the libraries listed below.

Adlik / smoothquantplus
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
☆23 · Mar 15, 2024 · Updated 2 years ago
rachittshah / DSpy-KGs
LLM-driven automated knowledge graph construction from text using DSPy and Neo4j
☆20 · Aug 19, 2024 · Updated last year
S-Lab-System-Group / ChronusArtifact
☆23 · Jan 7, 2022 · Updated 4 years ago
vc-nju / drfi_python
DRFI For Region Dissection
☆13 · Jan 11, 2019 · Updated 7 years ago
msr-fiddle / blox
☆46 · Jul 4, 2024 · Updated 2 years ago
petuum-inc / poseidon-release
Release doc/tutorial/wheels for poseidon-tf
☆10 · Jan 18, 2018 · Updated 8 years ago
suquark / hoplite
☆43 · Sep 6, 2021 · Updated 4 years ago
pengyanghua / DL2
A deep learning-driven scheduler for elastic training in deep learning clusters
☆31 · Jan 14, 2021 · Updated 5 years ago
MatanHamilis / one_stencil
Multiple 1-stencil implementations using NVIDIA CUDA.
☆12 · Dec 2, 2017 · Updated 8 years ago
robcasloz / llvm-discovery
Discovery of Structured Parallelism In Sequential and Parallel Code
☆10 · Feb 13, 2021 · Updated 5 years ago
JianboGuo / network-decoupling
Code for the paper: Network Decoupling: From Regular to Depthwise Separable Convolutions
☆13 · Dec 9, 2018 · Updated 7 years ago
kdrag0n / velocity_dream
Velocity Kernel for the Samsung Galaxy S8/S8+ (dreamlte/dream2lte). (discontinued)
☆10 · May 30, 2019 · Updated 7 years ago
project-etalon / etalon
LLM Serving Performance Evaluation Harness
☆84 · Feb 25, 2025 · Updated last year
lhb8125 / Megatron-LM
Ongoing research training transformer models at scale
☆19 · Jul 9, 2026 · Updated last week
Ageliss / For_shared_codes
☆15 · Dec 9, 2018 · Updated 7 years ago
hellangleZ / Qwen3_autothink_adapter
Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…
☆22 · May 9, 2025 · Updated last year
bytedance / DRL-based-VM-Rescheduling
This repo contains the implementation of deep reinforcement learning (DRL) algorithms for virtual machine rescheduling in data centers.
☆12 · Dec 2, 2022 · Updated 3 years ago
casys-kaist / EnvPipe
☆27 · Aug 31, 2023 · Updated 2 years ago
EfficientLLMSys / MuxServe
☆15 · Jun 26, 2024 · Updated 2 years ago
uillianluiz / RUBiS
Updated version of the RUBiS benchmark (http://rubis.ow2.org/)
☆12 · Jun 20, 2017 · Updated 9 years ago
DS3Lab / AC-SGD
Code associated with the paper "Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees".
☆29 · Apr 25, 2023 · Updated 3 years ago
twilligon / git-lfs-client-worker
Serve large files on Cloudflare Pages directly from Git LFS
☆16 · Sep 1, 2023 · Updated 2 years ago
bonniesjli / icm
Intrinsic Curiosity Module (ICM) + PPO on the Pyramid and PushBlock environments.
☆12 · Sep 3, 2019 · Updated 6 years ago
xuqifan897 / Optimus
☆28 · Jul 11, 2021 · Updated 5 years ago
NetX-lab / Ayo
[ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo
☆75 · Mar 11, 2026 · Updated 4 months ago
ppzqh / Energy-Efficient-Algorithms_CloudSim
☆12 · Dec 16, 2020 · Updated 5 years ago
KevinLee1110 / dynamic-batching
The official repo for the paper "Optimizing LLM Inference Throughput via Memory-aware and SLA-constrained Dynamic Batching"
☆18 · Mar 17, 2025 · Updated last year
gogolgrind / PyTorchNMTF
NMF/NTF with PyTorch
☆17 · Mar 24, 2019 · Updated 7 years ago
ogreen / GpuTriangleCounting
Triangle Counting for the GPU using CUDA.
☆14 · Nov 5, 2015 · Updated 10 years ago
yashgyy / Comp_Arch-Resources
☆23 · Jul 6, 2026 · Updated 2 weeks ago
hydra-hoard / hydra
A decentralised application that creates high quality machine learning datasets
☆12 · Jan 22, 2019 · Updated 7 years ago
James-QiuHaoran / LLM-serving-with-proxy-models
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an …
☆52 · Jun 1, 2024 · Updated 2 years ago
eth-easl / pccheck
☆12 · Apr 23, 2026 · Updated 2 months ago
Sreyan88 / Disfluency-Detection-with-Span-Classification
This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp…
☆14 · Jun 6, 2023 · Updated 3 years ago
longxianlei / G-ResNeXt_GroupNet
Re-implementation of Group ConvNet, also called G-ResNeXt; a reproduction of the paper "Differentiable Learning-to-Gr…
☆21 · Dec 4, 2019 · Updated 6 years ago
S-Lab-System-Group / Lucid
Lucid: A Non-Intrusive, Scalable and Interpretable Scheduler for Deep Learning Training Jobs
☆61 · May 21, 2023 · Updated 3 years ago
Montimage / maip
A platform that provides users with easy access to AI services developed by Montimage and usage of explainable AI techniques (e.g., LIME,…
☆10 · Feb 17, 2026 · Updated 5 months ago
gty111 / GEMM_WMMA
GEMM by WMMA (tensor core)
☆15 · Jul 31, 2022 · Updated 3 years ago
shuoshuc / FabricEval
An evaluation framework for data center traffic engineering.
☆14 · Jul 28, 2024 · Updated last year