howardlau1999 / autograd
A simple demonstration of how PyTorch autograd works
☆16 Updated 3 years ago
Alternatives and similar repositories for autograd
Users who are interested in autograd are comparing it to the repositories listed below.
- ☆21 Updated this week
- High performance RDMA-based distributed feature collection component for training GNN model on EXTREMELY large graph ☆55 Updated 3 years ago
- An efficient concurrent graph processing system ☆46 Updated 3 years ago
- PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class. ☆10 Updated 3 years ago
- C++ interfaces for RDMA access ☆78 Updated this week
- Rebuild YatSenOS On RISC-V 64. ☆20 Updated 3 years ago
- A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling. ☆43 Updated 3 years ago
- My paper/code reading notes in Chinese ☆46 Updated 2 months ago
- ☆27 Updated 6 months ago
- Codes for MO's Trading ☆15 Updated 3 years ago
- system paper reading notes ☆246 Updated 3 years ago
- ☆22 Updated 6 years ago
- Source code for the FAST '23 paper “MadFS: Per-File Virtualization for Userspace Persistent Memory Filesystems” ☆43 Updated 2 years ago
- 2022 ECS CloudBuild Distributed Cache Contest - Final Round https://tianchi.aliyun.com/competition/entrance/531982/introduction ☆17 Updated 2 years ago
- General system research material (not limited to paper) reading notes. ☆22 Updated 4 years ago
- ☆56 Updated 4 years ago
- ☆36 Updated last year
- ☆35 Updated 3 years ago
- SocksDirect code repository ☆19 Updated 3 years ago
- [OSDI 2024] Motor: Enabling Multi-Versioning for Distributed Transactions on Disaggregated Memory ☆50 Updated last year
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces. ☆53 Updated last year
- ☆15 Updated 3 years ago
- A high-performance file system for multicore CPUs and flash storage ☆33 Updated 2 years ago
- A User-Transparent Block Cache Enabling High-Performance Out-of-Core Processing with In-Memory Programs ☆74 Updated 2 years ago
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications. ☆25 Updated 10 months ago
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments. ☆47 Updated last week
- Experimental KV store engine on non-volatile memory ☆72 Updated 4 years ago
- Vector search with bounded performance. ☆36 Updated last year
- BytePS examples (Vision, NLP, GAN, etc) ☆19 Updated 2 years ago
- Repo for OSDI 2023 paper: "Ship your Critical Section Not Your Data: Enabling Transparent Delegation with TCLocks" ☆21 Updated 10 months ago