yuanxinnn/APTMoE

yuanxinnn / APTMoE

☆13

Alternatives and similar repositories for APTMoE

Users who are interested in APTMoE are comparing it to the libraries listed below.

SJTU-IPADS / MetaAttention
MetaAttention: A Unified and Performant Attention Framework Across Hardware Backends (PPoPP'26)
☆16 · Dec 31, 2025 · Updated 6 months ago
pku-liang / MAGIS
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)
☆57 · May 29, 2024 · Updated 2 years ago
U202142209 / shuati
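An online quiz platform for students with a Django backend and a Vue frontend. Features include: login and registration via an emailed verification code; logged-in users can answer questions (single-choice); users can view their own answer history and manage a wrong-answer notebook (add/remove questions); users can view per-question statistics (number of respondents, accuracy rate…
☆15 · Oct 1, 2023 · Updated 2 years ago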
floatingsun / transformer_layers_as_painters
transformer layers behavior as painters🧑‍🎨
☆15 · May 6, 2025 · Updated last year
hyhuang00 / moe_inference
Code Repository for the NeurIPS 2024 Paper "Toward Efficient Inference for Mixture of Experts".
☆19 · Oct 30, 2024 · Updated last year
zejia-lin / BulletServe
Boosting GPU utilization for LLM serving via dynamic spatial-temporal prefill & decode orchestration
☆53 · Jan 8, 2026 · Updated 6 months ago
S-Lab-System-Group / HeliosArtifact
HeliosArtifact
☆22 · Sep 27, 2022 · Updated 3 years ago
MLSysU / EcoServe
[OSDI'26] Efficient LLM Serving on Commodity GPU Clusters with Data-Reduced Cross-Instance Orchestration
☆23 · Jul 5, 2026 · Updated 3 weeks ago
gen-robot / StreamingVLA
Official repo for "StreamingVLA: Streaming Vision-Language-Action Model with Action Flow Matching and Adaptive Early Observation"
☆30 · Jun 29, 2026 · Updated last month
EfficientMoE / MoE-Infinity
PyTorch library for cost-effective, fast and easy serving of MoE models.
☆327 · Updated this week
sail-sg / VocabularyParallelism
Vocabulary Parallelism
☆26 · Mar 10, 2025 · Updated last year
tanzelin430 / libsmctrl
A reproduction of the libsmctrl paper, with an added Python interface that can be called flexibly from Python to allocate compute resources
☆12 · May 21, 2024 · Updated 2 years ago
wkcn / Adaptive-Fast-Face-Color-Transfer
Reproduction of the paper "Adaptive Fast Face Color Transfer" (《自适应的快速人脸肤色转移》)
☆26 · Jul 25, 2017 · Updated 9 years ago
jetson-nano-wheels / jetson-nano-wheels
Unofficial wheels for some machine-learning Python libraries, for the Nvidia Jetson Nano.
☆18 · Aug 24, 2021 · Updated 4 years ago
Monaco12138 / sr
☆10 · Sep 14, 2023 · Updated 2 years ago
ysyisyourbrother / Galaxy-LM
Work-in-progress LLM framework.
☆16 · Oct 31, 2024 · Updated last year
illinois-impact / klap
A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches
☆15 · Jun 21, 2019 · Updated 7 years ago
vv314 / sky-piano
Web version of the piano from Sky: Children of the Light (光遇) https://vv314.github.io/sky-piano/
☆40 · Jan 4, 2023 · Updated 3 years ago
ZhangJiaQiao / 2020-DBMS-project
This is the final project of 2020 DBMS course in SYSU
☆10 · Jun 23, 2020 · Updated 6 years ago
wu-kan / wuk_cupti_wrapper
a simple API to use CUPTI
☆10 · Aug 19, 2025 · Updated 11 months ago
vuhpdc / jellyfish
Source code for Jellyfish, a soft real-time inference serving system
☆15 · Dec 20, 2022 · Updated 3 years ago
vineeths96 / Gradient-Compression
We present a set of all-reduce compatible gradient compression algorithms which significantly reduce the communication overhead while mai…
☆10 · Nov 14, 2021 · Updated 4 years ago
bargees / barge-xhyve
Barge running on xhyve hypervisor
☆15 · Jun 7, 2022 · Updated 4 years ago
MachineLearningSystem / 25ASPLOS-Medusa
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆12 · Nov 8, 2024 · Updated last year
tile-ai / tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
☆20 · Updated this week
iis-eth-zurich / hd_dvs
Integrating Event-based Dynamic Vision Sensors with Sparse Hyperdimensional Computing
☆13 · Jul 9, 2020 · Updated 6 years ago
cakeng / ASPEN
This is the proof-of-concept CPU implementation of ASPEN used for the NeurIPS'23 paper ASPEN: Breaking Operator Barriers for Efficient Pa…
☆13 · Apr 4, 2024 · Updated 2 years ago
e2ebridge / saml2-proxy
SAML2 authenticating proxy
☆10 · Jul 28, 2014 · Updated 12 years ago
tyler-griggs / melange-release
☆48 · Jun 27, 2024 · Updated 2 years ago
aisoft9 / JYCache
DRAM/SSD hybrid caching system
☆15 · Mar 13, 2025 · Updated last year
hiddenlayer2020 / ML-Job-Scheduler-MLFS
☆13 · Dec 18, 2020 · Updated 5 years ago
LeiWang1999 / TVM.CMakeExtend
Tutorials on extending and importing TVM with CMake include dependencies.
☆16 · Oct 11, 2024 · Updated last year
nukomeet / gapcio
Manage your SSH keys with ease using Github
☆12 · Jun 28, 2015 · Updated 11 years ago
awslabs / Lancet-Accelerating-MoE-Training-via-Whole-Graph-Computation-Communication-Overlapping
Official implementation for the paper Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapp…
☆14 · May 20, 2026 · Updated 2 months ago
SNU-ARC / DecDEC
[OSDI 2025] DecDEC: A Systems Approach to Advancing Low‑Bit LLM Quantization
☆26 · Jan 29, 2026 · Updated 6 months ago
gudiandian / ElasticFlow
☆17 · May 10, 2024 · Updated 2 years ago
lookwei / COMP4423
Course materials for COMP 4423 - Computer Vision for Beginners at the Hong Kong Polytechnic University
☆35 · Nov 20, 2023 · Updated 2 years ago
vertical-knowledge / flask-ripozo
A Python package for integrating ripozo with Flask
☆14 · Nov 21, 2016 · Updated 9 years ago
WalkerWorldPeace / DOGE
Official implementation of "Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent".
☆23 · May 23, 2025 · Updated last year