usc-isi / PipeEdge
PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices
☆35Updated last year
Alternatives and similar repositories for PipeEdge
Users who are interested in PipeEdge are comparing it to the repositories listed below.
- This is a list of awesome edge AI inference-related papers.☆95Updated last year
- ☆40Updated 4 years ago
- a deep learning-driven scheduler for elastic training in deep learning clusters☆30Updated 4 years ago
- Official Repo for "LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization"☆34Updated last year
- Cloud-edge collaboration - collaborative inference📚Dynamic adaptive DNN surgery for inference acceleration on the edge☆40Updated last year
- ☆13Updated 5 years ago
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup☆35Updated 2 years ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c…☆26Updated 2 years ago
- ☆14Updated 10 months ago
- MobiSys#114☆21Updated last year
- Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning☆40Updated 3 years ago
- Lucid: A Non-Intrusive, Scalable and Interpretable Scheduler for Deep Learning Training Jobs☆54Updated 2 years ago
- ☆99Updated last year
- ☆50Updated 2 years ago
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"☆49Updated 7 months ago
- Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '2…☆15Updated last year
- PyTorch implementation of the paper: Decomposing Vision Transformers for Collaborative Inference in Edge Devices☆12Updated 11 months ago
- ☆22Updated 2 years ago
- ☆202Updated last year
- InFi is a library for building input filters for resource-efficient inference.☆38Updated last year
- Source code for the paper: "A Latency-Predictable Multi-Dimensional Optimization Framework for DNN-driven Autonomous Systems"☆22Updated 4 years ago
- ☆16Updated last year
- Artifacts for our ASPLOS'23 paper ElasticFlow☆52Updated last year
- Artifacts for our SIGCOMM'22 paper Muri☆42Updated last year
- Implementation of the paper: RTCoInfer: Real-time Edge-Cloud Collaborative CNN Inference for Stream Analytics on Ubiquitous Images☆14Updated 2 years ago
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters.☆26Updated last year
- ☆21Updated 3 years ago
- Artifact for "Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning" [NSDI '23]☆44Updated 2 years ago
- GRACE - GRAdient ComprEssion for distributed deep learning☆140Updated 11 months ago
- zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices [MobiSys'21] - Artifact Evaluation☆25Updated 4 years ago