Oneflow-Inc / dfccl
☆20 Updated last month
Alternatives and similar repositories for dfccl:
Users who are interested in dfccl are comparing it to the libraries listed below.
- Thunder Research Group's Collective Communication Library ☆34 Updated 10 months ago
- ☆75 Updated 2 years ago
- ⚡️ Write HGEMM from scratch using Tensor Cores with the WMMA, MMA, and CuTe APIs to achieve peak ⚡️ performance. ☆59 Updated 2 weeks ago
- ☆91 Updated 11 months ago
- ☆87 Updated 2 weeks ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing. By pro… ☆68 Updated this week
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning ☆23 Updated 3 months ago
- Paella: Low-latency Model Serving with Virtualized GPU Scheduling ☆58 Updated 10 months ago
- REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche… ☆93 Updated 2 years ago
- Microsoft Collective Communication Library ☆60 Updated 4 months ago
- Artifacts of EVT ASPLOS'24 ☆24 Updated last year
- NCCL Profiling Kit ☆127 Updated 8 months ago
- FlexFlow Serve: Low-Latency, High-Performance LLM Serving ☆29 Updated this week
- ☆36 Updated 3 months ago
- High performance Transformer implementation in C++. ☆109 Updated 2 months ago
- Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS ☆20 Updated last month
- Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core. ☆57 Updated 6 months ago
- We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel … ☆178 Updated last month
- ☆9 Updated last year
- ☆88 Updated 6 months ago
- ☆19 Updated 5 months ago
- ☆25 Updated 11 months ago
- ☆39 Updated 5 years ago
- ThrillerFlow is a Dataflow Analysis and Codegen Framework written in Rust. ☆14 Updated 4 months ago
- A hierarchical collective communications library with portable optimizations ☆32 Updated 3 months ago
- ☆55 Updated 2 months ago
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces. ☆53 Updated 7 months ago
- ☆23 Updated 2 years ago
- Ultra | Ultimate | Unified CCL ☆50 Updated last month