QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression.
☆36 · Aug 29, 2025 · Updated 6 months ago
Alternatives and similar repositories for quickreduce
Users who are interested in quickreduce are comparing it to the libraries listed below.
- A "standard library" of Triton kernels. ☆22 · Oct 2, 2025 · Updated 4 months ago
- ☆48 · Updated this week
- AI Tensor Engine for ROCm ☆360 · Updated this week
- A TensorFlow Extension: GPU performance tools for TensorFlow. ☆26 · Jul 27, 2023 · Updated 2 years ago
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆26 · Updated this week
- ☆31 · Apr 19, 2025 · Updated 10 months ago
- Large Language Model Text Generation Inference on Habana Gaudi ☆34 · Mar 20, 2025 · Updated 11 months ago
- Modular RDMA Interface ☆84 · Updated this week
- tokviz is a Python library for visualizing tokenization patterns across different language models. ☆12 · Apr 25, 2024 · Updated last year
- DeepSeek-V3/R1 inference performance simulator ☆179 · Mar 27, 2025 · Updated 11 months ago
- NVIDIA Inference Xfer Library (NIXL) ☆898 · Updated this week
- Boosting 4-bit inference kernels with 2:4 Sparsity ☆93 · Sep 4, 2024 · Updated last year
- Zipkin client for ASGI. Compatible with the Starlette framework and the Jaeger tracing server. ☆10 · Apr 21, 2024 · Updated last year
- Performance tests for multinode NGC.Ready certification ☆15 · Jan 28, 2026 · Updated last month
- ☆10 · Nov 16, 2024 · Updated last year
- A conda-smithy repository for ctng-compiler-activation. ☆14 · Feb 12, 2026 · Updated 2 weeks ago
- Live video streaming server built with Sanic framework and opencv-python. ☆12 · Oct 29, 2017 · Updated 8 years ago
- Runway Port of BigBiGAN from the paper "Large Scale Adversarial Representation Learning" ☆10 · Dec 10, 2024 · Updated last year
- pip install patchelf. patchelf Python wheel for PyPI. ☆11 · Updated this week
- The runtimex package helps expose Go runtime internal representations safely. ☆13 · Feb 19, 2025 · Updated last year
- ☆13 · Aug 1, 2023 · Updated 2 years ago
- A Python implementation based on the official Redis article on distributed locks ☆10 · Jan 16, 2021 · Updated 5 years ago
- Run-time validation of tensors for machine-learning systems. ☆11 · Apr 8, 2021 · Updated 4 years ago
- Presentation repository around making an API that retrieves large amounts of geospatial data quickly ☆12 · Mar 7, 2023 · Updated 2 years ago
- Fork of LLVM Project containing a Colossus IPU backend implementation ☆13 · Feb 2, 2026 · Updated 3 weeks ago
- Pytest plugin: add multihost framework. ☆11 · Nov 27, 2025 · Updated 3 months ago
- Terribly incorrect and incomplete AOT compiler for mRuby. Source code for the LLVM Social Berlin #20 ☆10 · Aug 25, 2022 · Updated 3 years ago
- More Quake. Less bullshit. ☆12 · Apr 26, 2015 · Updated 10 years ago
- ☆18 · Apr 16, 2025 · Updated 10 months ago
- The goal of the OSSCI Fleet is to provide a central mechanism to enable test automation, batch job scheduling, and developer access to a … ☆13 · Feb 18, 2026 · Updated last week
- ☆160 · Dec 27, 2024 · Updated last year
- A simple, stdlib only, Go module for generating RFC9562 UUIDs ☆17 · Feb 19, 2026 · Updated last week
- Haidar's Web Page ☆13 · Oct 9, 2024 · Updated last year
- Extensions for the TG geometry library ☆12 · Dec 3, 2024 · Updated last year
- Swift package for reading and writing Safetensors files. ☆12 · Feb 6, 2026 · Updated 3 weeks ago
- ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback ☆18 · Dec 3, 2024 · Updated last year
- A Decentralised microblogging website on ethereum blockchain ☆14 · Jun 11, 2018 · Updated 7 years ago
- A conda-smithy repository for jaxlib. ☆17 · Nov 4, 2025 · Updated 3 months ago
- Multiprocessing compatible memory leak debugger inspired by dozer/dowser ☆14 · Mar 15, 2023 · Updated 2 years ago