UofT-EcoSystem / BPPSA-open
The (open-source part of) code to reproduce "BPPSA: Scaling Back-propagation by Parallel Scan Algorithm".
☆13 · Jun 7, 2021 · Updated 4 years ago
Alternatives and similar repositories for BPPSA-open
Users who are interested in BPPSA-open are comparing it to the libraries listed below:
- Switch-based Training Acceleration for Machine Learning (SwitchML) ☆16 · Apr 13, 2021 · Updated 4 years ago
- ☆14 · Nov 7, 2025 · Updated 3 months ago
- A PyTorch wrapper of parallel exclusive scan in CUDA ☆12 · May 25, 2023 · Updated 2 years ago
- ☆24 · May 6, 2022 · Updated 3 years ago
- ☆27 · Mar 2, 2023 · Updated 2 years ago
- Community maintained hardware plugin for vLLM on AWS Neuron ☆21 · Feb 3, 2026 · Updated last week
- This repository corresponds to the PICCO compiler for secure multi-party computation published in 2013 with more recent efficiency improv… ☆12 · Nov 13, 2025 · Updated 3 months ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c… ☆27 · Dec 10, 2022 · Updated 3 years ago
- Cocytus is an efficient and available in-memory K/V-store through hybrid erasure coding and replication ☆30 · Mar 7, 2016 · Updated 9 years ago
- A rust-based benchmark for BlueField SmartNICs. ☆30 · Jul 5, 2023 · Updated 2 years ago
- Code released to accompany the ISCA paper: "T4: Compiling Sequential Code for Effective Speculative Parallelization in Hardware" ☆28 · Feb 18, 2022 · Updated 3 years ago
- Johann, the lightweight and flexible scenario orchestrator ☆12 · Oct 3, 2022 · Updated 3 years ago
- A math-wiki for university students, competition participants, and high school students to look things up ☆10 · May 26, 2022 · Updated 3 years ago
- LITS: An Optimized Learned Index for Strings ☆13 · Jun 18, 2025 · Updated 7 months ago
- netbeacon - monitoring your network capture, NIDS or network analysis process ☆19 · Oct 26, 2013 · Updated 12 years ago
- ☆10 · Jun 28, 2025 · Updated 7 months ago
- Code accompanying the NeurIPS 2019 paper AutoAssist: A Framework to Accelerate Training of Deep Neural Networks. ☆14 · Oct 3, 2022 · Updated 3 years ago
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning ☆10 · Apr 28, 2023 · Updated 2 years ago
- ☆44 · Nov 15, 2021 · Updated 4 years ago
- ☆15 · Jul 18, 2023 · Updated 2 years ago
- ☆12 · May 18, 2024 · Updated last year
- ☆13 · Jan 21, 2022 · Updated 4 years ago
- ☆10 · Feb 20, 2021 · Updated 4 years ago
- A set of platform agnostic to measure the performance of various BPF helper functions ☆10 · Sep 11, 2023 · Updated 2 years ago
- Disco Stochastic Network Calculator ☆10 · Aug 15, 2017 · Updated 8 years ago
- How to plot for papers, slides, demos, etc. ☆10 · Apr 7, 2022 · Updated 3 years ago
- ☆11 · Oct 11, 2023 · Updated 2 years ago
- A Java port of Jieba 0.39 that supports all core features of the original Jieba ☆12 · Feb 14, 2019 · Updated 7 years ago
- Kernel Module that implements Paxos protocol ☆11 · Oct 23, 2020 · Updated 5 years ago
- A JIT compiler implemented with MLIR/LLVM for faster query processing in SQLite ☆18 · Jan 3, 2023 · Updated 3 years ago
- ☆11 · Mar 13, 2023 · Updated 2 years ago
- ☆11 · Apr 3, 2023 · Updated 2 years ago
- A Coq framework to support structural design and proof of hardware cache-coherence protocols ☆14 · May 7, 2022 · Updated 3 years ago
- NVMesh Container Storage Interface (CSI) Driver for Kubernetes ☆11 · Oct 7, 2024 · Updated last year
- ☆11 · Sep 22, 2017 · Updated 8 years ago
- SixArm.com » Brew install scripts for our various packages ☆12 · Apr 14, 2025 · Updated 9 months ago
- Beep the PC speaker ☆11 · Nov 9, 2022 · Updated 3 years ago
- sgx-based encrypted deduplication prototype ☆14 · May 14, 2021 · Updated 4 years ago
- Busy Beaver deciders backed by Coq proof ☆13 · Jan 29, 2026 · Updated 2 weeks ago