alpa-projects / tensorflow-alpa
☆20Updated 2 years ago
Alternatives and similar repositories for tensorflow-alpa
Users who are interested in tensorflow-alpa are comparing it to the libraries listed below.
- ☆79Updated 2 years ago
- An experimental parallel training platform☆54Updated last year
- An Efficient Pipelined Data Parallel Approach for Training Large Model☆76Updated 4 years ago
- A baseline repository of Auto-Parallelism in Training Neural Networks☆144Updated 3 years ago
- MLIR-based partitioning system☆97Updated this week
- Synthesizer for optimal collective communication algorithms☆108Updated last year
- ☆90Updated 6 months ago
- ☆117Updated last month
- Microsoft Collective Communication Library☆64Updated 7 months ago
- Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.☆67Updated 3 months ago
- nnScaler: Compiling DNN models for Parallel Training☆113Updated last week
- ☆74Updated 4 years ago
- NCCL Profiling Kit☆138Updated 11 months ago
- FTPipe and related pipeline model parallelism research.☆41Updated 2 years ago
- Paella: Low-latency Model Serving with Virtualized GPU Scheduling☆59Updated last year
- Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion☆32Updated last year
- Experiments and prototypes associated with IREE or MLIR☆50Updated 10 months ago
- ☆9Updated 3 years ago
- Microsoft Collective Communication Library☆350Updated last year
- A home for the final text of all TVM RFCs.☆105Updated 9 months ago
- ☆92Updated 2 years ago
- ☆144Updated 4 months ago
- An extension of TVMScript to write simple and high performance GPU kernels with tensorcore.☆50Updated 11 months ago
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning☆23Updated last month
- Ultra and Unified CCL☆165Updated this week
- REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche…☆94Updated 2 years ago
- ☆98Updated last year
- ☆32Updated 2 years ago
- Shared Middle-Layer for Triton Compilation☆256Updated this week
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections☆121Updated 3 years ago