Fast and efficient attention method exploration and implementation.
☆25 · Mar 25, 2025 · Updated last year
Alternatives and similar repositories for FlashMLA
Users that are interested in FlashMLA are comparing it to the libraries listed below.
- An NVIDIA AI Workbench Example Project for Finetuning Llama 2 ☆35 · Aug 29, 2024 · Updated last year
- Emulating DMA Engines on GPUs for Performance and Portability ☆41 · May 17, 2015 · Updated 10 years ago
- DLBlas: clean and efficient kernels ☆36 · Updated this week
- ☆119 · May 16, 2025 · Updated 10 months ago
- A series of high-performance GEMM (General Matrix Multiply) implementations Iteratively optimised for H100 GPUs in Pure CUDA. ☆76 · Feb 18, 2026 · Updated last month
- LUFS and True Peak metering (app+plug) ☆11 · Feb 14, 2016 · Updated 10 years ago
- Notes and artifacts from the ONNX steering committee ☆28 · Updated this week
- ☆15 · Feb 13, 2018 · Updated 8 years ago
- ☆59 · Nov 21, 2024 · Updated last year
- A Vote-and-Verify Strategy for Fast Spatial Verification in Image Retrieval ☆19 · Jun 7, 2017 · Updated 8 years ago
- Export Blender (2.4x) curves to TikZ format for use with TeX ☆13 · Apr 18, 2014 · Updated 11 years ago
- A tool to detect infrastructure issues on cloud native AI systems ☆53 · Sep 18, 2025 · Updated 6 months ago
- ONCache: A Cache-Based Low-Overhead Container Overlay Network ☆21 · Jun 7, 2025 · Updated 10 months ago
- ☆13 · Aug 1, 2025 · Updated 8 months ago
- Artifacts of EVT ASPLOS'24 ☆30 · Mar 6, 2024 · Updated 2 years ago
- EPOCH Input System Version 2 ☆10 · Jun 5, 2020 · Updated 5 years ago
- ☆74 · Apr 2, 2026 · Updated last week
- ☆17 · Dec 9, 2024 · Updated last year
- High-performance LLM operator library built on TileLang. ☆98 · Updated this week
- Load your graph of bookmarks and tags into a neo4j database and explore it ☆14 · Jan 1, 2017 · Updated 9 years ago
- ☆16 · Oct 13, 2023 · Updated 2 years ago
- Global Address SPace toolbox -- Julia wrapper ☆10 · Nov 17, 2017 · Updated 8 years ago
- Matrix Algebra on GPU and Multicore Architectures (MAGMA) source releases from http://icl.cs.utk.edu/magma/index.html ☆25 · Jun 4, 2015 · Updated 10 years ago
- G'MIC-Qt is a versatile front-end to the image processing framework G'MIC. ☆17 · Mar 18, 2026 · Updated 3 weeks ago
- A flexible resource compiler similar to bin2h and qrc ☆20 · Apr 18, 2020 · Updated 5 years ago
- Prototype for a SPIR-V assembler and disassembler. It provides a composable Java interface for generating SPIR-V code at runtime. ☆13 · Oct 31, 2025 · Updated 5 months ago
- 🐝 Tiny CLI to post simultaneously to Mastodon and Bluesky ☆17 · Apr 3, 2026 · Updated last week
- Card game of War written in Elixir and Rust. ☆11 · Jun 18, 2022 · Updated 3 years ago
- AMD’s C++ library for accelerating tensor primitives ☆49 · Apr 1, 2026 · Updated last week
- ☆16 · Sep 27, 2018 · Updated 7 years ago
- Continuum Dynamics Evaluation and Test Suite ☆15 · Aug 29, 2017 · Updated 8 years ago
- Cognitive Science 2 exam project ☆12 · Jun 3, 2018 · Updated 7 years ago
- Phi-2 Colab Notebook ☆14 · Dec 14, 2023 · Updated 2 years ago
- Reconstruction of distorted underwater images using robust registration ☆15 · Apr 16, 2019 · Updated 6 years ago
- Html interface for GLPK.js solver ☆14 · Aug 11, 2020 · Updated 5 years ago
- ext_mpi_collectives ☆11 · Mar 27, 2026 · Updated 2 weeks ago
- Argonne Leadership Computing Facility OpenCL tutorial ☆10 · Aug 22, 2025 · Updated 7 months ago
- Parallel SpMV using CSR representation, built in CUDA ☆14 · Jun 27, 2020 · Updated 5 years ago
- The GNU MathProg implementation of OSeMOSYS ☆12 · Nov 7, 2024 · Updated last year