A lightweight Triton-based General Matrix Multiplication (GEMM) library.
☆64May 14, 2026Updated last month
Alternatives and similar repositories for tritonBLAS
Users that are interested in tritonBLAS are comparing it to the libraries listed below.
- A Triton-only attention backend for vLLM☆25Mar 17, 2026Updated 2 months ago
- A Triton JIT runtime and ffi provider in C++☆35May 27, 2026Updated 2 weeks ago
- LLVM/MLIR based compiler instrumentation of AMD GPU kernels☆21Jul 13, 2025Updated 11 months ago
- Header-only library of GPU-accelerated, concurrent data structures.☆12Nov 11, 2025Updated 7 months ago
- A collection of papers about physically-based animation for deformable bodies☆13Feb 16, 2022Updated 4 years ago
- HIP backend patch for Numba, the NumPy aware dynamic Python compiler using LLVM.☆21May 11, 2026Updated last month
- Specification and description of the MathOptFormat file format☆19Sep 28, 2023Updated 2 years ago
- A dynamic GPU memory allocator, suitable for warp synchronized scenarios.☆11Aug 20, 2019Updated 6 years ago
- C++ Library of the Linear Conjugate Gradient Methods (LibLCG)☆11Aug 23, 2022Updated 3 years ago
- A lightweight, general-purpose framework for evaluating and benchmarking GPU kernels.☆53Updated this week
- ☆12Mar 14, 2024Updated 2 years ago
- Open-source library for Graph Streaming. Solves the connected components problem using sub-linear space. Published in SIGMOD'22.☆11Apr 6, 2026Updated 2 months ago
- A reference implementation of std::simd, providing data parallel types in the C++ standard☆14Mar 9, 2020Updated 6 years ago
- ☆13Nov 25, 2021Updated 4 years ago
- Implementation of various equivariant models in JAX☆19Apr 12, 2024Updated 2 years ago
- ☆27May 18, 2025Updated last year
- PaiNN in jax☆11Jan 14, 2025Updated last year
- Hierarchical Loss function☆13May 6, 2019Updated 7 years ago
- Docker image for☆11Dec 25, 2017Updated 8 years ago
- Matching algorithms for LightGraphs.jl☆13Oct 21, 2021Updated 4 years ago
- ☆11Sep 19, 2024Updated last year
- ☆10Sep 16, 2020Updated 5 years ago
- Collection of open source OpenGL demos, graphics prototypes and physics sims.☆15May 29, 2021Updated 5 years ago
- Benchmarking scripts for Gaia☆15Apr 10, 2025Updated last year
- SapienIPC experimental release. This release is temporary and will not be maintained. We will release a stable version soon.☆21Dec 8, 2025Updated 6 months ago
- Tensor Parallelism with JAX + Shard Map☆11Sep 29, 2023Updated 2 years ago
- ☆17Mar 26, 2025Updated last year
- E(n) Equivariant GNN in jax☆14Aug 31, 2023Updated 2 years ago
- A 3D mass-spring real-world simulator with more types of forces (gravity, electricity, spring, collision, ...)☆19Sep 4, 2020Updated 5 years ago
- Scale-out system monitoring☆24Jun 5, 2026Updated last week
- In this folder my all Python codes are stored.☆17May 2, 2021Updated 5 years ago
- Incomplete-Cholesky preconditioned conjugate gradient algorithm implemented with cuBLAS/cuSPARSE☆12Jun 24, 2022Updated 3 years ago
- Automated bottleneck detection and solution orchestration☆21Feb 24, 2026Updated 3 months ago
- 🕹 Demos of Games 103 (Physics Simulation) Using Unity☆18Jan 16, 2022Updated 4 years ago
- ☆10Dec 29, 2020Updated 5 years ago
- [DEPRECATED] Moved to ROCm/rocm-systems repo☆146Updated this week
- A fast alternative to the standard C/C++ pow() function. With adjustable accuracy-space tradeoff.☆14Jul 12, 2013Updated 12 years ago
- UCAS network login☆13Nov 17, 2018Updated 7 years ago
- ☆10Apr 24, 2023Updated 3 years ago