Fast and efficient attention method exploration and implementation.
☆25Mar 25, 2025Updated 11 months ago
Alternatives and similar repositories for FlashMLA
Users who are interested in FlashMLA are comparing it to the libraries listed below
- A series of high-performance GEMM (General Matrix Multiply) implementations iteratively optimised for H100 GPUs in pure CUDA.☆70Feb 18, 2026Updated last week
- Artifacts of EVT ASPLOS'24☆29Mar 6, 2024Updated last year
- Notes and artifacts from the ONNX steering committee☆28Updated this week
- PARADIS, a lightweight and flexible weather forecast model that tries to Keep It Simple.☆26Feb 4, 2026Updated 3 weeks ago
- ext_mpi_collectives☆11Apr 1, 2025Updated 11 months ago
- OpenMP offload playground☆10Nov 16, 2024Updated last year
- ☆11May 7, 2022Updated 3 years ago
- ☆11Feb 27, 2024Updated 2 years ago
- Argonne Leadership Computing Facility OpenCL tutorial☆10Aug 22, 2025Updated 6 months ago
- EPOCH Input System Version 2☆10Jun 5, 2020Updated 5 years ago
- Home Page☆16Oct 4, 2025Updated 4 months ago
- ☆10Feb 5, 2026Updated 3 weeks ago
- ☆115May 16, 2025Updated 9 months ago
- A tool to detect infrastructure issues on cloud native AI systems☆52Sep 18, 2025Updated 5 months ago
- Reconstruction of distorted underwater images using robust registration☆15Apr 16, 2019Updated 6 years ago
- ExaWorks SDK☆11Feb 1, 2024Updated 2 years ago
- Elixir library to produce diff reports of two strings, or two lists of objects (of any kind)☆10Apr 28, 2025Updated 10 months ago
- OpenVINO LLM Benchmark☆11Dec 7, 2023Updated 2 years ago
- Global Address SPace toolbox -- Julia wrapper☆10Nov 17, 2017Updated 8 years ago
- SODECL is a library of ordinary differential equation (ODE) and stochastic differential equation (SDE) solvers in OpenCL.☆11Jul 4, 2020Updated 5 years ago
- Enhancing the convergence speed by 2x and improving the training success of Physics-Informed Neural Networks (PINNs).☆13Oct 14, 2024Updated last year
- Workshop materials for AI Engineer World's Fair☆14Jun 3, 2025Updated 8 months ago
- Dependencies Upgrade with multi-agents (CrewAI & Langgraph)☆11Sep 9, 2024Updated last year
- ☆12Aug 4, 2025Updated 6 months ago
- Network streaming of kinect depth data using gstreamer to southcast servers☆15Oct 25, 2011Updated 14 years ago
- An easy and flexible mathematical programming environment for Python.☆12Jun 16, 2018Updated 7 years ago
- LaTeX template for ITMO-style presentations☆10Jan 19, 2025Updated last year
- Scripts for viewing Slurm batch job resource usages☆11Jan 3, 2022Updated 4 years ago
- FMS Model Optimizer is a framework for developing reduced precision neural network models.☆21Updated this week
- Need to generate a bunch of TileMill projects that are nearly identical and then render them all out? Want to script that? We gotcha cove…☆32Jul 29, 2015Updated 10 years ago
- Persistent dense gemm for Hopper in `CuTeDSL`☆15Aug 9, 2025Updated 6 months ago
- The GNU MathProg implementation of OSeMOSYS☆11Nov 7, 2024Updated last year
- Reference implementation for the climate segmentation benchmark, based on the Exascale Deep Learning for Climate Analytics work☆10May 6, 2020Updated 5 years ago
- A toolkit for developers to simplify the transformation of nn.Module instances. It now corresponds to PyTorch's torch.fx.☆13Apr 7, 2023Updated 2 years ago
- Export Blender (2.4x) curves to TikZ format for use with TeX☆13Apr 18, 2014Updated 11 years ago
- Terraform template to deploy IBM Spectrum Scale on Oracle Cloud Infrastructure (OCI)☆10Aug 21, 2025Updated 6 months ago
- Code files that accompany my Substack posts☆24Updated this week
- ☆11Apr 24, 2025Updated 10 months ago
- Python routines for parallel analysis of large MITgcm simulations☆12Jun 23, 2016Updated 9 years ago