high-performance linear attention kernel library built on TileLang
☆363 · Apr 30, 2026 · Updated this week
Alternatives and similar repositories for FlashQLA
Users who are interested in FlashQLA are comparing it to the libraries listed below.
- ☆11 · Jun 9, 2023 · Updated 2 years ago
- Quantized Attention on GPU ☆44 · Nov 22, 2024 · Updated last year
- ☆11 · Apr 5, 2021 · Updated 5 years ago
- CS169.1x Software as a Service course offered by UC Berkeley at edx.org ☆14 · Oct 28, 2014 · Updated 11 years ago
- Codes for MO's Trading ☆15 · Mar 20, 2022 · Updated 4 years ago
- A parser for PTX 6.5 ☆13 · Jun 19, 2023 · Updated 2 years ago
- A cross-platform RISC-V interpreter that implements the RV32IMA instruction set. ☆24 · Aug 23, 2022 · Updated 3 years ago
- Self-hosted responsive photo/album manager & server written in nodejs, koa2, react, redux ☆11 · May 25, 2017 · Updated 8 years ago
- Created a simple neural network using C++17 standard and the Eigen library that supports both forward and backward propagation. ☆11 · Jul 27, 2024 · Updated last year
- This repository provides a tutorial, which discusses running sample publisher and subscriber using multiple transports of point_cloud_trans… ☆11 · Mar 17, 2026 · Updated last month
- ☆18 · Mar 18, 2024 · Updated 2 years ago
- Local inference for Llama and Tongyi Qianwen (Qwen) large models, implemented with Apple Metal ☆10 · Apr 26, 2024 · Updated 2 years ago
- FlashSampling: Fast and Memory-Efficient Exact Sampling (https://huggingface.co/papers/2603.15854) ☆70 · Apr 25, 2026 · Updated last week
- Deployment version of CenterNet3D, easy to port to different platforms (onnx, tensorRT, rknn, Horizon). ☆13 · May 24, 2024 · Updated last year
- Nsight Python is a Python kernel profiling interface based on NVIDIA Nsight Tools ☆198 · Apr 24, 2026 · Updated last week
- ☆15 · Nov 14, 2023 · Updated 2 years ago
- Substrate TypeScript SDK ☆10 · Sep 20, 2024 · Updated last year
- Repository for "Training Language Models To Explain Their Own Computations" ☆22 · Dec 22, 2025 · Updated 4 months ago
- ☆41 · Mar 31, 2022 · Updated 4 years ago
- ☆18 · Nov 22, 2025 · Updated 5 months ago
- Open deep learning compiler stack for cpu, gpu and specialized accelerators ☆19 · Apr 24, 2026 · Updated last week
- unofficial implementation of YOLOP TensorRT ☆12 · Dec 11, 2021 · Updated 4 years ago
- Open ABI and FFI for Machine Learning Systems ☆383 · Updated this week
- ☆24 · Apr 29, 2025 · Updated last year
- TensorRT Acceleration for PyTorch Native Eager Mode Quantization Models ☆17 · Jul 22, 2024 · Updated last year
- DMon Prototype for OSDI 2021 Artifact Evaluation ☆24 · May 4, 2021 · Updated 4 years ago
- learn TensorRT from scratch🥰 ☆18 · Sep 29, 2024 · Updated last year
- Tutorials of Extending and importing TVM with CMAKE Include dependency. ☆16 · Oct 11, 2024 · Updated last year
- CUDA-accelerated 3D point cloud algorithm library, continuously updated (includes cudaicp, glfw point cloud visualization, etc.) ☆16 · Aug 24, 2022 · Updated 3 years ago
- Automatically summarize lectures and ask questions about the course material ☆13 · Apr 16, 2024 · Updated 2 years ago
- A simple, generic, and flexible keyframe animation library for Rust. ☆30 · Mar 27, 2026 · Updated last month
- Implementing smart pointers in C++ step by step ☆10 · Jun 6, 2021 · Updated 4 years ago
- IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse ☆91 · Mar 14, 2026 · Updated last month
- TenniS: Tensor based Edge Neural Network Inference System ☆15 · Feb 28, 2024 · Updated 2 years ago
- ☆20 · Sep 28, 2024 · Updated last year
- A CUDA kernel for NHWC GroupNorm for PyTorch ☆23 · Nov 15, 2024 · Updated last year
- ☆12 · May 30, 2025 · Updated 11 months ago
- Allow torch tensor memory to be released and resumed later ☆241 · Apr 20, 2026 · Updated last week
- ☆17 · May 14, 2025 · Updated 11 months ago