usc-isi / PipeEdge
PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices
☆25 · Updated 7 months ago
Related projects:
- This is a list of awesome edgeAI inference related papers. ☆84 · Updated 8 months ago
- ☆93 · Updated 8 months ago
- ☆37 · Updated 3 years ago
- MobiSys#114 ☆21 · Updated last year
- A curated list of early exiting ☆24 · Updated 3 weeks ago
- PyTorch implementation of the paper: Decomposing Vision Transformers for Collaborative Inference in Edge Devices ☆9 · Updated last month
- ☆14 · Updated last month
- Source code for Jellyfish, a soft real-time inference serving system ☆12 · Updated last year
- Create tiny ML systems for on-device learning. ☆20 · Updated 3 years ago
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup ☆31 · Updated last year
- InFi is a library for building input filters for resource-efficient inference. ☆37 · Updated 10 months ago
- Official Repo for "LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization" ☆25 · Updated 6 months ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c… ☆23 · Updated last year
- The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading" [MobiCom'2022] ☆18 · Updated 2 years ago
- A deep learning-driven scheduler for elastic training in deep learning clusters ☆27 · Updated 3 years ago
- Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning ☆35 · Updated 3 years ago
- ☆74 · Updated last year
- LLM serving cluster simulator ☆55 · Updated 4 months ago
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters. ☆14 · Updated 4 months ago
- Artifacts for our ASPLOS'23 paper ElasticFlow ☆51 · Updated 4 months ago
- ☆16 · Updated 11 months ago
- HeliosArtifact ☆17 · Updated last year
- Measuring and predicting on-device metrics (latency, power, etc.) of machine learning models ☆66 · Updated last year
- Auto-Split: A General Framework of Collaborative Edge-Cloud AI ☆13 · Updated 3 years ago
- Artifacts for our SIGCOMM'22 paper Muri ☆38 · Updated 8 months ago
- ☆12 · Updated 4 years ago
- Distributed CNN inference at the edge; extends ncnn with CUDA and MPI+OpenMP support. ☆18 · Updated last year
- Source code for the paper: "A Latency-Predictable Multi-Dimensional Optimization Framework for DNN-driven Autonomous Systems" ☆19 · Updated 3 years ago
- Cloud-edge collaboration: collaborative inference 📚 Dynamic adaptive DNN surgery for inference acceleration on the edge ☆26 · Updated last year
- This project implements experiments on BranchyNet partitioning using the PyTorch framework ☆28 · Updated 4 years ago