peichenxie/FPRev

☆26

Alternatives and similar repositories for FPRev

Users who are interested in FPRev are comparing it to the repositories listed below.

foundation-model-stack / vllm-triton-backend
A Triton-only attention backend for vLLM
☆27 • Updated this week
KuangjuX / AttnLink
An experimental communicating attention kernel based on DeepEP.
☆34 • Jul 29, 2025 • Updated 11 months ago
ColfaxResearch / cfx-article-src
☆192 • May 7, 2025 • Updated last year
tile-ai / tilescale
Tile-based language built for AI computation across all scales
☆173 • Jun 16, 2026 • Updated last month
bcarlet / ptx-math
☆20 • Jan 1, 2023 • Updated 3 years ago
lsds / Tempo
Tempo is a system for declarative, efficient, end-to-end compiled dynamic deep learning
☆30 • Oct 21, 2025 • Updated 9 months ago
SJTU-IPADS / MetaAttention
MetaAttention: A Unified and Performant Attention Framework Across Hardware Backends (PPoPP'26)
☆16 • Dec 31, 2025 • Updated 6 months ago
muriloboratto / NVSHEMEM
Sample Codes using NVSHMEM on Multi-GPU
☆30 • Jan 22, 2023 • Updated 3 years ago
Mogball / triton_lite
☆20 • May 24, 2025 • Updated last year
ColfaxResearch / layout-categories
This repository contains companion software for the Colfax Research paper "Categorical Foundations for CuTe Layouts".
☆139 • Sep 24, 2025 • Updated 9 months ago
tile-ai / TileFoundry
☆54 • Updated this week
eunomia-bpf / nccl-eBPF
☆20 • Jul 7, 2026 • Updated 2 weeks ago
togethercomputer / ParallelKernelBench
☆41 • Jul 1, 2026 • Updated 2 weeks ago
flagos-ai / DeepSeek-V4-FlagOS
☆16 • Updated this week
tile-ai / AttentionEngine
☆52 • May 19, 2025 • Updated last year
ryanmacdonald / Ray-Tracing-GPU
RTL implementation of a ray-tracing GPU
☆16 • Dec 18, 2012 • Updated 13 years ago
Siddharth13s / RISC-V_Synthesis_and_Physical_Design
Synthesis (Synopsys DC) and physical design flow (Synopsys ICC II) for my 5-stage pipelined RISC-V design, using 32 nm technology
☆15 • Jul 31, 2024 • Updated last year
TiledTensor / TiledKernel
TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.
☆19 • May 12, 2024 • Updated 2 years ago
cnrv / CNRV-FPU
Basic floating-point components for RISC-V processors
☆69 • Dec 4, 2019 • Updated 6 years ago
MeshInfra / WaferLLM
WaferLLM: Large Language Model Inference at Wafer Scale
☆112 • Jun 12, 2026 • Updated last month
FdyCN / PTX-ISA
Chinese translation of the CUDA PTX-ISA documentation
☆56 • Sep 29, 2025 • Updated 9 months ago
ROCm / tritonBLAS
A lightweight Triton-based General Matrix Multiplication (GEMM) library.
☆65 • Jun 13, 2026 • Updated last month
TiledTensor / TiledBench
Benchmark tests supporting the TiledCUDA library.
☆19 • Nov 19, 2024 • Updated last year
marsupialtail / gpu-sparsert
☆18 • Oct 15, 2020 • Updated 5 years ago
yester31 / Cutlass_EX
Study of CUTLASS
☆22 • Nov 10, 2024 • Updated last year
yonsei-hpcp / gcstack_gcscaler
☆16 • Jun 17, 2025 • Updated last year
SemiAnalysisAI / microbench-blackwell
☆121 • May 10, 2026 • Updated 2 months ago
wzc810049078 / ZC-RISCV-CORE
ZC RISCV CORE
☆12 • Dec 19, 2019 • Updated 6 years ago
cchan / fp8_mul
A tiny FP8 multiplication unit written in Verilog. TinyTapeout 2 submission.
☆14 • Nov 23, 2022 • Updated 3 years ago
ROCm / aotriton
Ahead of Time (AOT) Triton Math Library
☆100 • Jul 13, 2026 • Updated last week
THU-DSP-LAB / ventus-env
Ventus GPGPU development environment
☆16 • Updated this week
ademeure / cuda-side-boost
☆60 • Feb 24, 2026 • Updated 4 months ago
xiao-tai / ics2021
Complete NEMU based on RISC-V instruction set
☆14 • Apr 10, 2024 • Updated 2 years ago
north-numerical-computing / tensor-cores-numerical-behavior
Test suite for probing the numerical behavior of NVIDIA tensor cores
☆42 • Jul 24, 2024 • Updated last year
milkv-duo / cvitek-tdl-sdk-cv180x
TDL-SDK samples for SDK V1 Duo (CV180X)
☆14 • Feb 6, 2024 • Updated 2 years ago
lenLRX / AmpereSparseMatmul
Study of Ampere's Sparse Matmul
☆18 • Jan 10, 2021 • Updated 5 years ago
serdes21 / flashtile
FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.
☆61 • Feb 6, 2026 • Updated 5 months ago
ROCm / amd_matrix_instruction_calculator
A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators
☆139 • Apr 10, 2026 • Updated 3 months ago
YJMSTR / flash-linear-attention
FLA but cuTile
☆27 • Apr 17, 2026 • Updated 3 months ago