usc-isi / PipeEdge
PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices
☆35 · Updated last year
Alternatives and similar repositories for PipeEdge
Users interested in PipeEdge are comparing it to the libraries listed below.
- This is a list of awesome edge AI inference related papers ☆98 · Updated last year
- ☆40 · Updated 5 years ago
- ☆100 · Updated last year
- A deep learning-driven scheduler for elastic training in deep learning clusters ☆32 · Updated 4 years ago
- ☆208 · Updated last year
- Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and …" ☆34 · Updated last month
- MobiSys #114 ☆22 · Updated 2 years ago
- A curated list of awesome projects and papers for AI on Mobile/IoT/Edge devices. Everything is continuously updating. Welcome contributio… ☆43 · Updated 2 years ago
- PyTorch implementation of the paper: Decomposing Vision Transformers for Collaborative Inference in Edge Devices ☆13 · Updated last year
- ☆78 · Updated 2 years ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c… ☆27 · Updated 2 years ago
- A Portable C Library for Distributed CNN Inference on IoT Edge Clusters ☆83 · Updated 5 years ago
- Artifacts for our SIGCOMM'22 paper Muri ☆43 · Updated last year
- A Deep Learning Cluster Scheduler ☆39 · Updated 4 years ago
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup ☆35 · Updated 2 years ago
- ☆38 · Updated 3 months ago
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling ☆13 · Updated last year
- GRACE - GRAdient ComprEssion for distributed deep learning ☆140 · Updated last year
- ☆16 · Updated 2 years ago
- HeliosArtifact ☆21 · Updated 3 years ago
- ☆51 · Updated 2 years ago
- Artifacts for our ASPLOS'23 paper ElasticFlow ☆53 · Updated last year
- ☆23 · Updated 3 years ago
- ☆13 · Updated 5 years ago
- Simple PyTorch graph capturing ☆20 · Updated 2 years ago
- ☆22 · Updated 2 years ago
- Deep Compressive Offloading: Speeding Up Neural Network Inference by Trading Edge Computation for Network Latency ☆28 · Updated 4 years ago
- A curated list of research in Systems for Edge Intelligence and Computing (Edge MLSys), including Frameworks, Tools, Repository, etc. Paper… ☆30 · Updated 3 years ago
- Cloud-edge collaboration (collaborative inference) 📚 Dynamic adaptive DNN surgery for inference acceleration on the edge ☆42 · Updated 2 years ago
- Source code for the paper: "A Latency-Predictable Multi-Dimensional Optimization Framework for DNN-driven Autonomous Systems" ☆22 · Updated 4 years ago