MyCaffe / NCCL
Windows version of NVIDIA's NCCL ('Nickel') for multi-GPU training. Please use https://github.com/NVIDIA/nccl for changes.
☆61 · Updated last month
Alternatives and similar repositories for NCCL
Users who are interested in NCCL are comparing it to the libraries listed below.
- An easy way to run, test, benchmark and tune OpenCL kernel files ☆24 · Updated 2 years ago
- ☆135 · Updated 3 weeks ago
- AMD's graph optimization engine. ☆272 · Updated this week
- Header-only safetensors loader and saver in C++ ☆74 · Updated 2 weeks ago
- cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples on how to use it ☆674 · Updated 3 weeks ago
- A nvImageCodec library of GPU- and CPU-accelerated codecs featuring a unified interface ☆134 · Updated last week
- A TensorFlow Extension: GPU performance tools for TensorFlow. ☆26 · Updated 2 years ago
- Computation using data flow graphs for scalable machine learning ☆68 · Updated this week
- ONNX Runtime: cross-platform, high performance scoring engine for ML models ☆78 · Updated this week
- Common utilities for ONNX converters ☆291 · Updated last month
- OneFlow->ONNX ☆43 · Updated 2 years ago
- Large Language Model ONNX Inference Framework ☆36 · Updated last month
- C++ implementations for various tokenizers (sentencepiece, tiktoken etc). ☆44 · Updated last month
- A faster implementation of OpenCV-CUDA that uses OpenCV objects, and more! ☆54 · Updated 2 months ago
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆499 · Updated last week
- Standalone Flash Attention v2 kernel without libtorch dependency ☆113 · Updated last year
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆85 · Updated last year
- ☆78 · Updated last year
- Development repository for the Triton language and compiler ☆140 · Updated this week
- ☆23 · Updated 2 years ago
- Fork of https://source.codeaurora.org/quic/hexagon_nn/nnlib ☆58 · Updated 2 years ago
- An unofficial CUDA assembler, for all generations of SASS, hopefully :) ☆84 · Updated 2 years ago
- MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into … ☆207 · Updated this week
- oneCCL Bindings for Pytorch* (deprecated) ☆104 · Updated 2 weeks ago
- ☆98 · Updated 4 years ago
- ☆18 · Updated 2 years ago
- study of cutlass ☆22 · Updated last year
- Tiny C++ LLM inference implementation from scratch ☆97 · Updated last month
- ☆125 · Updated 2 years ago
- ☆71 · Updated 9 months ago