usc-isi / PipeEdge
PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices
☆29Updated last year
Alternatives and similar repositories for PipeEdge:
Users who are interested in PipeEdge are comparing it to the libraries listed below
- ☆40Updated 4 years ago
- a deep learning-driven scheduler for elastic training in deep learning clusters☆29Updated 4 years ago
- This is a list of awesome edgeAI inference related papers.☆95Updated last year
- ☆13Updated 5 years ago
- Cloud-edge collaboration: collaborative inference 📚 Dynamic adaptive DNN surgery for inference acceleration on the edge☆37Updated last year
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup☆34Updated 2 years ago
- ☆9Updated last year
- ☆22Updated last year
- Official Repo for "LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization"☆29Updated last year
- ☆14Updated 8 months ago
- HeliosArtifact☆20Updated 2 years ago
- A Deep Learning Cluster Scheduler☆37Updated 4 years ago
- ☆99Updated last year
- Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning☆41Updated 3 years ago
- Lucid: A Non-Intrusive, Scalable and Interpretable Scheduler for Deep Learning Training Jobs☆53Updated last year
- PyTorch implementation of the paper: Decomposing Vision Transformers for Collaborative Inference in Edge Devices☆12Updated 8 months ago
- PyTorch implementation of the paper: Multi-Agent Collaborative Inference via DNN Decoupling: Intermediate Feature Compression and Edge Le…☆39Updated last year
- ☆49Updated 2 years ago
- LLM serving cluster simulator☆96Updated 11 months ago
- ☆20Updated 3 years ago
- A PyTorch implementation of the experiments in the paper: Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge.☆13Updated last year
- INFOCOM 2024: Online Resource Allocation for Edge Intelligence with Colocated Model Retraining and Inference☆14Updated 6 months ago
- A curated list of early exiting (LLM, CV, NLP, etc)☆49Updated 7 months ago
- Artifacts for our ASPLOS'23 paper ElasticFlow☆51Updated 11 months ago
- ☆11Updated 4 years ago
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters.☆24Updated 11 months ago
- ☆41Updated 9 months ago
- ☆37Updated 3 years ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c…☆25Updated 2 years ago
- InFi is a library for building input filters for resource-efficient inference.☆37Updated last year