SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
☆36Mar 1, 2023Updated 3 years ago
Alternatives and similar repositories for SHADE
Users that are interested in SHADE are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆38Jan 15, 2021Updated 5 years ago
- ☆24Oct 31, 2023Updated 2 years ago
- ☆23Jun 21, 2023Updated 2 years ago
- [ICDCS 2023] DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining☆12Dec 4, 2023Updated 2 years ago
- Near-optimal Prefetching System☆33Nov 17, 2021Updated 4 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- Artifacts of VLDB'22 paper "COMET: A Novel Memory-Efficient Deep Learning TrainingFramework by Using Error-Bounded Lossy Compression"☆10Aug 2, 2022Updated 3 years ago
- ☆22Aug 13, 2024Updated last year
- Ginex: SSD-enabled Billion-scale Graph Neural Network Training on a Single Machine via Provably Optimal In-memory Caching☆43Jul 10, 2024Updated last year
- ☆13May 9, 2023Updated 3 years ago
- [MSST '24] SAS-Cache: A Semantic-Aware Secondary Cache for LSM-based Key-Value Stores☆14Jun 3, 2024Updated 2 years ago
- ☆31May 31, 2023Updated 3 years ago
- ☆54Dec 13, 2022Updated 3 years ago
- HDF5 Cache VOL connector for caching data on fast storage layers and moving data asynchronously to the parallel file system to hide I/O o…☆22Feb 10, 2026Updated 4 months ago
- Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training☆24Mar 1, 2024Updated 2 years ago
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- ☆31Feb 22, 2024Updated 2 years ago
- STREAMer: Benchmarking remote volatile and non-volatile memory bandwidth☆18Aug 21, 2023Updated 2 years ago
- A memcomparable serialization format.☆24May 16, 2023Updated 3 years ago
- Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '2…☆15Sep 21, 2023Updated 2 years ago
- ☆13Apr 7, 2025Updated last year
- "JABAS: Joint Adaptive Batching and Automatic Scaling for DNN Training on Heterogeneous GPUs" (EuroSys '25)☆16Apr 7, 2025Updated last year
- [MSST '24] Prophet: Optimizing LSM-Based Key-Value Store on ZNS SSDs with File Lifetime Prediction and Compaction Compensation.☆15Apr 20, 2024Updated 2 years ago
- SCARIF is a tool to estimate the embodied carbon emissions of data center servers with accelerator hardware (GPUs, FPGAs, etc.)☆15Updated this week
- The design and algorithms used in LeCaR are described in this USENIX HotStorage'18 paper and talk slides: https://www.usenix.org/conferen…☆28Jun 4, 2020Updated 6 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- 一门公开课《MIT6.824》的大作业☆12Jun 21, 2021Updated 4 years ago
- Herald: Accelerating Neural Recommendation Training with Embedding Scheduling (NSDI 2024)☆23May 9, 2024Updated 2 years ago
- ☆19Jun 3, 2023Updated 3 years ago
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.☆58Aug 21, 2024Updated last year
- ☆121Apr 23, 2026Updated last month
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult…☆40Mar 17, 2024Updated 2 years ago
- GeminiFS: A Companion File System for GPUs☆82Feb 18, 2025Updated last year
- Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines☆19Dec 8, 2023Updated 2 years ago
- Persistent Memory Test Suite☆14Apr 29, 2020Updated 6 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Artifact for "Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning" [NSDI '23]☆46Nov 24, 2022Updated 3 years ago
- [ICLR 2022] "PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication" by Cheng Wan, Y…☆34Mar 15, 2023Updated 3 years ago
- OpenEmbedding is an open source framework for Tensorflow distributed training acceleration.☆33Apr 13, 2023Updated 3 years ago
- An implementation of online data mixing for the Pile dataset, based on the GPT-NeoX library.☆14Jan 9, 2024Updated 2 years ago
- Benchmark implementation of CosmoFlow in TensorFlow Keras☆22Feb 7, 2024Updated 2 years ago
- ☆12Mar 26, 2024Updated 2 years ago
- A distributed in-memory store for temporal knowledge graphs☆10Mar 20, 2024Updated 2 years ago