OrderLab / TrainCheck
A Framework for Automated Validation of Deep Learning Training Tasks
☆13Updated this week
Alternatives and similar repositories for TrainCheck:
Users who are interested in TrainCheck are comparing it to the repositories listed below:
- Automated Testing and Adaptive Detection of **Slow Faults** in Distributed Systems☆12Updated last month
- Website for Artifact Evaluation at EuroSys, SOSP, OSDI, ATC☆43Updated last week
- Nu is a new datacenter system that enables developers to build fungible applications that can use datacenter resources wherever they are.☆38Updated 11 months ago
- ☆16Updated 9 months ago
- A Program-Behavior-Guided Far Memory System☆35Updated last year
- OSDI'24 Nomad implementation☆44Updated 5 months ago
- Tiered memory management☆74Updated 7 months ago
- ☆14Updated 9 months ago
- A Rust-based benchmark for BlueField SmartNICs.☆28Updated last year
- Tiered Memory Management: Access Latency is the Key!☆49Updated last month
- Orbit: OS Support for Safe and Efficient Auxiliary Tasks in Applications☆20Updated 2 years ago
- A collection of awesome researchers and papers about disaggregated memory.☆153Updated 2 weeks ago
- Hermit: Low-Latency, High-Throughput, and Transparent Remote Memory via Feedback-Directed Asynchrony☆34Updated 11 months ago
- Canvas: Isolated and Adaptive Swapping for Multi-Applications on Remote Memory☆38Updated 2 years ago
- ☆46Updated 6 months ago
- This repository contains a list of papers on various topics (that I am working/worked on) in the system and networking area.☆79Updated 4 months ago
- Artifact evaluation repo for EuroSys'24.☆25Updated last year
- TeRM: Extending RDMA-Attached Memory with SSD [FAST'24]☆42Updated 6 months ago
- Johnny Cache: the End of DRAM Cache Conflicts (in Tiered Main Memory Systems)☆18Updated last year
- ☆57Updated 11 months ago
- Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]☆22Updated 4 months ago
- Fastswap, a fast swap system for far memory through RDMA☆80Updated last year
- Expressive, Easy to Build, and High-Performance Application Networks☆16Updated 3 months ago
- Project Mitosis Introduction☆18Updated 2 years ago
- A random collection of research papers / projects that interest me☆20Updated 3 years ago
- MIND: In-Network Memory Management for Disaggregated Data Centers☆42Updated 3 years ago
- Source code for "DiLOS: Do Not Trade Compatibility for Performance in Memory Disaggregation (EuroSys'23)"☆18Updated last year
- [USENIX ATC '21] Exploring the Design Space of Page Management for Multi-Tiered Memory Systems☆44Updated 3 years ago
- AIFM: High-Performance, Application-Integrated Far Memory☆120Updated 2 years ago
- Arbitrary offloads for RDMA NICs☆89Updated 3 years ago