vuhpdc / jellyfish
Source code for Jellyfish, a soft real-time inference serving system
☆13 · Updated 2 years ago
Alternatives and similar repositories for jellyfish
Users who are interested in jellyfish are comparing it to the repositories listed below.
- ☆21 · Updated last year
- ☆56 · Updated 3 years ago
- ☆14 · Updated 10 months ago
- Source code and datasets for Ekya, a system for continuous learning on the edge. ☆106 · Updated 3 years ago
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup ☆35 · Updated 2 years ago
- ☆202 · Updated last year
- ☆45 · Updated 2 years ago
- ☆16 · Updated last year
- A deep learning-driven scheduler for elastic training in deep learning clusters ☆30 · Updated 4 years ago
- A list of awesome edge AI inference-related papers. ☆95 · Updated last year
- ☆26 · Updated 2 years ago
- ☆10 · Updated 4 years ago
- MobiSys#114 ☆21 · Updated last year
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling ☆13 · Updated last year
- PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices ☆35 · Updated last year
- Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '2… ☆15 · Updated last year
- BATCH: Adaptive Batching for Efficient Machine Learning Serving on Serverless Platforms ☆10 · Updated 3 years ago
- ☆17 · Updated last year
- THC: Accelerating Distributed Deep Learning Using Tensor Homomorphic Compression ☆19 · Updated 10 months ago
- Source code of IPA, https://escholarship.org/uc/item/2p0805dq ☆10 · Updated 11 months ago
- We present a set of all-reduce compatible gradient compression algorithms which significantly reduce the communication overhead while mai… ☆10 · Updated 3 years ago
- LaLaRAND: Flexible Layer-by-Layer CPU/GPU Scheduling for Real-Time DNN Tasks ☆15 · Updated 3 years ago
- ☆50 · Updated 2 years ago
- Metis: Learning to Schedule Long-Running Applications in Shared Container Clusters at Scale ☆18 · Updated 5 years ago
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow" ☆49 · Updated 7 months ago
- Primo: Practical Learning-Augmented Systems with Interpretable Models ☆19 · Updated last year
- ☆21 · Updated 2 years ago
- Official code repository for "CoVA: Exploiting Compressed-Domain Analysis to Accelerate Video Analytics [USENIX ATC 22]" ☆16 · Updated 9 months ago
- ☆99 · Updated last year
- ☆13 · Updated last year