openxla / triton
Fork of Triton repository for OpenXLA uses of the Triton language and compiler
☆11 Updated this week
Alternatives and similar repositories for triton:
Users who are interested in triton are comparing it to the libraries listed below.
- Explore training for quantized models ☆12 Updated last week
- ☆13 Updated this week
- SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi ☆38 Updated 2 years ago
- asynchronous/distributed speculative evaluation for llama3 ☆37 Updated 5 months ago
- A server powering LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset. ☆13 Updated last year
- Main Repo for the OpenHW Group Software Task Group ☆15 Updated this week
- LLVM-Canon aims to transform LLVM modules into a canonical form by reordering and renaming instructions while preserving the same semanti… ☆14 Updated 8 months ago
- int8_t and int16_t matrix multiply based on https://arxiv.org/abs/1705.01991 ☆67 Updated last year
- JAX implementations of RWKV ☆19 Updated last year
- ☆55 Updated this week
- ☆18 Updated 3 months ago
- Generate python ctypes classes from C headers. Requires LLVM clang ☆15 Updated 5 months ago
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆104 Updated last week
- Course Project for COMP4471 on RWKV ☆16 Updated 11 months ago
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset. ☆32 Updated last year
- Hutter Prize Submission ☆12 Updated 3 years ago
- ☆46 Updated 10 months ago
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). … ☆43 Updated 3 months ago
- minimal C implementation of speculative decoding based on llama2.c ☆18 Updated 6 months ago
- ☆26 Updated last year
- Repository of model demos using TT-Buda ☆60 Updated last month
- ☆50 Updated 3 weeks ago
- A fork of OpenBLAS with Armv8-A SVE (Scalable Vector Extension) support ☆14 Updated 4 years ago
- Fast and vectorizable algorithms for searching in a vector of sorted floating point numbers ☆127 Updated 3 weeks ago
- tenstorrent kernel from twitch ☆27 Updated 10 months ago
- ☆9 Updated last year
- ☆53 Updated 7 months ago
- tinygrad port of the RWKV large language model. ☆44 Updated 7 months ago
- Experiments with BitNet inference on CPU ☆52 Updated 9 months ago