usc-isi / PipeEdge
PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices
☆37 · Updated last year
Alternatives and similar repositories for PipeEdge
Users interested in PipeEdge are comparing it to the repositories listed below.
- ☆213 · Updated last year
- A list of awesome edge AI inference-related papers. ☆98 · Updated 2 years ago
- A deep-learning-driven scheduler for elastic training in deep learning clusters. ☆31 · Updated 4 years ago
- ☆102 · Updated last year
- ☆13 · Updated 5 years ago
- Official repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and …" ☆36 · Updated 4 months ago
- ☆41 · Updated 5 years ago
- A Portable C Library for Distributed CNN Inference on IoT Edge Clusters. ☆88 · Updated 5 years ago
- InFi is a library for building input filters for resource-efficient inference. ☆41 · Updated 2 years ago
- MobiSys#114. ☆22 · Updated 2 years ago
- ☆78 · Updated 2 years ago
- A curated list of awesome projects and papers for AI on Mobile/IoT/Edge devices. Everything is continuously updating. Welcome contributio… ☆44 · Updated 2 years ago
- Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning. ☆42 · Updated 4 years ago
- PyTorch implementation of the paper "Decomposing Vision Transformers for Collaborative Inference in Edge Devices". ☆17 · Updated last year
- ☆26 · Updated last year
- [DAC 2024] EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive La… ☆74 · Updated last year
- Deep Compressive Offloading: Speeding Up Neural Network Inference by Trading Edge Computation for Network Latency. ☆28 · Updated 4 years ago
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup. ☆35 · Updated 3 years ago
- Cloud-edge collaborative inference 📚 Dynamic adaptive DNN surgery for inference acceleration on the edge. ☆44 · Updated 2 years ago
- Simple PyTorch graph capturing. ☆21 · Updated 2 years ago
- [IEEE Access] "Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-constrained Edge Computing Systems" and [… ☆36 · Updated 2 years ago
- iGniter, an interference-aware GPU resource provisioning framework for achieving predictable performance of DNN inference in the cloud. ☆39 · Updated last year
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters. ☆34 · Updated last year
- LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale. ☆170 · Updated 5 months ago
- A curated list of research in Systems for Edge Intelligence and Computing (Edge MLSys), including frameworks, tools, repositories, etc. Paper… ☆32 · Updated 4 years ago
- GRACE: GRAdient ComprEssion for distributed deep learning. ☆139 · Updated last year
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling. ☆12 · Updated last year
- LLM serving cluster simulator. ☆132 · Updated last year
- Proof-of-concept CPU implementation of ASPEN, used for the NeurIPS'23 paper "ASPEN: Breaking Operator Barriers for Efficient Pa…" ☆13 · Updated last year
- Source code and datasets for Ekya, a system for continuous learning on the edge. ☆112 · Updated 3 years ago