causalflow-ai / petit-kernel
Optimized FP16/BF16 x FP4 GPU kernels for AMD GPUs
☆39Feb 6, 2026Updated last week
Alternatives and similar repositories for petit-kernel
Users that are interested in petit-kernel are comparing it to the libraries listed below
- Reranking for Multi-objective Optimized Recommender Systems☆11Aug 3, 2023Updated 2 years ago
- Low-latency live streaming PoC☆11Jul 30, 2019Updated 6 years ago
- ☆14Dec 1, 2020Updated 5 years ago
- Reproduction of the libsmctrl paper, with an added Python interface so that compute resources can be allocated flexibly from Python☆12May 21, 2024Updated last year
- Keyphrase Extraction from Scholarly Documents - Thesis☆14Nov 3, 2021Updated 4 years ago
- An enhanced version of the goose3 general-purpose web page extractor (add the author on VX: 862187570 for Python discussion and learning)☆16Nov 18, 2021Updated 4 years ago
- A simple API to use CUPTI☆11Aug 19, 2025Updated 5 months ago
- ☆13Jan 14, 2026Updated last month
- Code and pruned models for our paper: K. Gkrispanis, N. Gkalelis, V. Mezaris, "Filter-Pruning of Lightweight Face Detectors Using a Geome…☆14May 8, 2024Updated last year
- Expected edit distance implementation using OpenFst tools☆11May 13, 2015Updated 10 years ago
- ☆11May 18, 2025Updated 8 months ago
- ☆24May 9, 2025Updated 9 months ago
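- FongMi video and tvbox configuration files. If you like them, please fork them for your own use. Read the repository instructions carefully before use; once you use them, you are deemed to have understood them.☆11Dec 25, 2023Updated 2 years ago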
- Standalone commandline CLI tool for compiling Triton kernels☆20Sep 13, 2024Updated last year
- ☆16Mar 17, 2025Updated 10 months ago
- Torch 7 + Android port of Neural style algorithm☆10May 10, 2016Updated 9 years ago
- Sequence to sequence model for Arabic punctuation prediction.☆12Feb 13, 2020Updated 6 years ago
- The official repository of the Eesen project☆12Jun 20, 2018Updated 7 years ago
- ☆52May 19, 2025Updated 8 months ago
- Implementation from scratch in C of the Multi-head latent attention used in the Deepseek-v3 technical paper.