Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.
☆15Aug 31, 2023Updated 2 years ago
Alternatives and similar repositories for flash_attention_inference
Users that are interested in flash_attention_inference are comparing it to the libraries listed below
- Standalone Flash Attention v2 kernel without libtorch dependency☆114Sep 10, 2024Updated last year
- CPU Memory Compiler and Parallel programming☆26Nov 18, 2024Updated last year
- Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage)☆10Feb 2, 2024Updated 2 years ago
- Bayesian Optimization Executable and Visualizable Application☆10Aug 14, 2023Updated 2 years ago
- A translator from C to MLIR☆33Nov 15, 2021Updated 4 years ago
- This project is about convolution operator optimization on GPU, including GEMM-based (implicit GEMM) convolution.☆43Sep 29, 2025Updated 5 months ago
- Implement Flash Attention using CuTe.☆102Dec 17, 2024Updated last year
- ☆12Feb 7, 2018Updated 8 years ago
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA cores for the decoding stage of LLM inference.☆46Jun 11, 2025Updated 8 months ago
- A simplified implementation of flash-attention using cutlass, intended for teaching purposes☆58Aug 12, 2024Updated last year
- Fast, lightweight and cross-platform code-editor☆13Mar 2, 2026Updated last week
- ☆10May 24, 2021Updated 4 years ago
- A hackable library for running and fine-tuning modern transformer models on commodity and alternative GPUs, powered by tinygrad.☆28Feb 10, 2026Updated 3 weeks ago
- Array Based Half-Facet (AHF) Mesh Data Structure (for simplex meshes)☆10Aug 29, 2020Updated 5 years ago
- Experiments with reasoning models, training techniques, papers☆25Updated this week
- Intel oneAPI RenderKit CMake superbuild☆13Jan 13, 2026Updated last month
- 2014: Variational Monte Carlo for the harmonic oscillator, helium, hydrogen and H2 - IPython notebook and FORTRAN90☆13Jun 23, 2016Updated 9 years ago
- ☆14Dec 9, 2021Updated 4 years ago
- Procedural city generation.☆13Oct 15, 2022Updated 3 years ago
- NeRF (Neural Radiance Field) TensorFlow v2 Keras Re-Implementation☆11Dec 15, 2022Updated 3 years ago
- Quantize yolov7 using pytorch_quantization.🚀🚀🚀☆12Oct 20, 2023Updated 2 years ago
- Fast SGEMM emulation on Tensor Cores☆17Feb 16, 2025Updated last year
- A YellowPage scraper is a Python program/script that extracts data from the YellowPages.com website using the Python programming language…☆11Apr 14, 2023Updated 2 years ago
- Qualitative evaluation of automatic chord extraction results: analysis of the musical relationships between predicted chords and target c…☆10Oct 25, 2021Updated 4 years ago
- Graduate Student Resume (MIT inspired resume format)☆10Aug 5, 2020Updated 5 years ago
- [NAACL 2021] Reading and Acting while Blindfolded: The Need for Semantics in Text Game Agents☆11May 31, 2021Updated 4 years ago
- propositional satisfiability problem (SAT) goes neural and deep☆12Aug 17, 2021Updated 4 years ago
- Flash Attention in ~100 lines of CUDA (forward pass only)☆11Jun 10, 2024Updated last year
- A research-purpose software for 3D morphing between two meshes with arbitrary connectivity.☆11Sep 2, 2025Updated 6 months ago
- ☆15Mar 2, 2026Updated last week
- Tutorial covering event driven web component. How to start and in general explaining how you can make your single-page app or any type of…☆11Mar 7, 2023Updated 3 years ago
- Implementation of the first neural natural logic paper on natural language inference☆11Oct 31, 2022Updated 3 years ago
- ☆21May 17, 2025Updated 9 months ago
- Text-guided 3D texture generation using training-free multi-diffusion in UV space.☆14Apr 7, 2025Updated 11 months ago
- [3DV 2025] CoE: Deep Coupled Embedding for Non-Rigid Point Cloud Correspondences☆18Jan 5, 2026Updated 2 months ago
- ☆19Updated this week
- Google's MediaPipe (v0.8.9) and Python Wheel installer for Jetson Nano (JetPack 4.6) compiled for CUDA 10.2☆16Jun 7, 2023Updated 2 years ago
- Fluid sounds, such as splashing and pouring, are ubiquitous and familiar but we lack physically based algorithms to synthesize them in co…☆12Apr 25, 2017Updated 8 years ago
- Official repository for paper "Goal-Aware Neural SAT Solver"☆17Jun 10, 2023Updated 2 years ago