ShaYeBuHui01 / flash_attention_inference
Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.
☆16 Aug 31, 2023 Updated 2 years ago
Alternatives and similar repositories for flash_attention_inference
Users who are interested in flash_attention_inference are comparing it to the libraries listed below.
- Standalone Flash Attention v2 kernel without libtorch dependency ☆114 Sep 10, 2024 Updated last year
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios. ☆44 Feb 27, 2025 Updated 11 months ago
- Flash Attention in raw CUDA C beating PyTorch ☆37 May 14, 2024 Updated last year
- Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage) ☆10 Feb 2, 2024 Updated 2 years ago
- This project is about convolution operator optimization on GPU, including GEMM-based (Implicit GEMM) convolution. ☆43 Sep 29, 2025 Updated 4 months ago
- Implement Flash Attention using Cute. ☆100 Dec 17, 2024 Updated last year
- Explains the conclusions of a logic program. ☆10 May 25, 2023 Updated 2 years ago
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference. ☆46 Jun 11, 2025 Updated 8 months ago
- ☆11 Dec 22, 2024 Updated last year
- A hackable library for running and fine-tuning modern transformer models on commodity and alternative GPUs, powered by tinygrad. ☆27 Nov 27, 2025 Updated 2 months ago
- Array Based Half-Facet (AHF) Mesh Data Structure (for simplex meshes) ☆10 Aug 29, 2020 Updated 5 years ago
- Intel oneAPI RenderKit CMake superbuild ☆12 Jan 13, 2026 Updated last month
- Estimate depth from surface normal. ☆12 Aug 14, 2020 Updated 5 years ago
- ☆18 Updated this week
- ☆10 May 24, 2021 Updated 4 years ago
- Retargetable ML compilers for the twenty-first century! ☆13 Apr 22, 2025 Updated 9 months ago
- Fast, lightweight and cross-platform code-editor ☆13 Jan 31, 2026 Updated 2 weeks ago
- The TextWorld KG Dataset from the paper Building Dynamic Knowledge Graphs from Text-based Games ☆10 Mar 11, 2020 Updated 5 years ago
- Procedural city generation. ☆13 Oct 15, 2022 Updated 3 years ago
- propositional satisfiability problem (SAT) goes neural and deep ☆12 Aug 17, 2021 Updated 4 years ago
- 2014: Variational Monte Carlo for the harmonic oscillator, helium, hydrogen and H2 - IPython notebook and FORTRAN90 ☆13 Jun 23, 2016 Updated 9 years ago
- NeRF (Neural Radiance Field) TensorFlow v2 Keras Re-Implementation ☆11 Dec 15, 2022 Updated 3 years ago
- ☆15 Updated this week
- [NAACL 2021] Reading and Acting while Blindfolded: The Need for Semantics in Text Game Agents ☆11 May 31, 2021 Updated 4 years ago
- ☆14 Oct 9, 2022 Updated 3 years ago
- ☆14 Dec 9, 2021 Updated 4 years ago
- Fast SGEMM emulation on Tensor Cores ☆17 Feb 16, 2025 Updated last year
- Fork of Karpathy's nanoGPT with custom datasets ☆10 Jul 25, 2023 Updated 2 years ago
- Tutorial covering event driven web component. How to start and in general explaining how you can make your single-page app or any type of… ☆11 Mar 7, 2023 Updated 2 years ago
- SandLogic Lexicons ☆20 Sep 11, 2025 Updated 5 months ago
- Sparse Boolean linear algebra for NVIDIA CUDA, OpenCL and CPU computations ☆16 Aug 19, 2022 Updated 3 years ago
- Official repository for paper "Goal-Aware Neural SAT Solver" ☆17 Jun 10, 2023 Updated 2 years ago
- ☆16 Feb 9, 2026 Updated last week
- a simple Flash Attention v2 implementation with ROCM (RDNA3 GPU, roc wmma), mainly used for stable diffusion(ComfyUI) in Windows ZLUDA en… ☆51 Aug 25, 2024 Updated last year
- Add on for blender that allows the user to generate landscape by simulating tectonic phenomena. It requires the official blender Add-On "… ☆11 Nov 24, 2020 Updated 5 years ago
- A vector field rendering library ☆17 Jul 31, 2019 Updated 6 years ago
- A service that parses a sentence using AMR and returns a set of Verbnet logic predicates grounded with the roles of the input sentence. ☆13 Apr 6, 2022 Updated 3 years ago
- IREE compiler and runtime for Snitch ☆14 Oct 9, 2025 Updated 4 months ago
- Fluid sounds, such as splashing and pouring, are ubiquitous and familiar but we lack physically based algorithms to synthesize them in co… ☆12 Apr 25, 2017 Updated 8 years ago