Flash attention implementation
Minimal CUDA implementation of Flash Attention with tiled computation and online softmax. Educational implementation based on Dao et al., 2022.
☆21 Dec 27, 2025 Updated 5 months ago
Alternatives and similar repositories for flash-attention-cuda
Users who are interested in flash-attention-cuda are comparing it to the libraries listed below.
- Security-native LLM system for AI-generated application security. ☆263 Jun 4, 2026 Updated 2 weeks ago
- ☆19 Oct 1, 2025 Updated 8 months ago
- This project showcases a comprehensive analysis of CO2 emissions in a fictitious cheese manufacturing supply chain using both graph datab… ☆11 Sep 18, 2024 Updated last year
- Material for the Design and Analysis of Algorithms course taught at Princess Sumaya University for Technology ☆67 Apr 7, 2026 Updated 2 months ago
- Training NVIDIA NeMo Megatron Large Language Model (LLM) using NeMo Framework on Google Kubernetes Engine ☆16 Apr 28, 2025 Updated last year
- A showcase application demonstrating 8 agentic patterns from LangChain4j with real-time visualization using D3.js and WebSocket streaming… ☆37 May 6, 2026 Updated last month
- ☆13 Apr 27, 2026 Updated last month
- Launch and configuration files for running Nav2 on MVsim worlds ☆19 Jun 5, 2026 Updated 2 weeks ago
- Deploying Spark machine learning models to Azure ☆15 Mar 28, 2023 Updated 3 years ago
- [AAAI 2025] Does VLM Classification Benefit from LLM Description Semantics? ☆26 Aug 5, 2025 Updated 10 months ago
- EE6427 Video Signal Processing ☆17 Jan 14, 2021 Updated 5 years ago
- ☆24 Nov 22, 2019 Updated 6 years ago
- A ROS Action server that handles communication with move base action server to achieve a list of required goal poses successively. ☆19 Sep 19, 2021 Updated 4 years ago
- ☆27 Sep 11, 2025 Updated 9 months ago
- ☆15 Oct 6, 2024 Updated last year
- ☆14 Aug 9, 2023 Updated 2 years ago
- Native macOS developer tool for local project discovery, code analysis, dependency tracking, and Git insights. ☆75 Mar 31, 2026 Updated 2 months ago
- Haskell bindings to Halide ☆20 Mar 18, 2024 Updated 2 years ago
- ☆24 Dec 12, 2024 Updated last year
- A tiny scalar-valued autograd engine and neural network library (Karpathy course) ☆12 Mar 16, 2026 Updated 3 months ago
- Won 2nd prize in HackUIET hackathon and best project in AI theme ☆20 Jan 1, 2025 Updated last year
- ☆35 Oct 31, 2025 Updated 7 months ago
- A search index specialised for LaTeX equations. Developed for latexsearch.com. ☆17 Jul 15, 2011 Updated 14 years ago
- High performance implementation of Deep neuroevolution in pytorch using mpi4py. Intended for use on HPC clusters ☆27 Jan 24, 2022 Updated 4 years ago
- Evaluate robustness of adaptation methods on large vision-language models ☆19 Aug 23, 2023 Updated 2 years ago
- Personal Workspace setup ☆26 Sep 13, 2024 Updated last year
- EXTD :: Extremely Tiny Face Detector via Iterative Filter Reuse ☆13 Jul 25, 2019 Updated 6 years ago
- Automation in iOS, iPadOS and macOS ☆22 Jan 2, 2021 Updated 5 years ago
- 🦄 Serving Platform for Spatial AI and Robotics. ☆23 Jun 19, 2025 Updated 11 months ago
- Run RF-DETR on NVIDIA DeepStream ☆52 Updated this week
- ☆12 Aug 6, 2022 Updated 3 years ago
- CustomLLM config to leverage watsonx LLMs with continue.dev. ☆17 Aug 27, 2024 Updated last year
- Official Codebase for "Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers" ☆27 Jun 7, 2025 Updated last year
- ☆12 Apr 19, 2024 Updated 2 years ago
- ☆12 Sep 21, 2023 Updated 2 years ago
- Minimal JAX implementation unifying Diffusion and Flow Matching algorithms as alternative strategies for transporting data distributions. ☆66 Dec 19, 2025 Updated 6 months ago
- 📝The official repository of "Rethinking Cross-Generator Image Forgery Detection through DINOv3" ☆25 Dec 2, 2025 Updated 6 months ago
- Comfyui Node Pack ☆31 Sep 17, 2025 Updated 9 months ago
- FLOPs and other statistics COunter for Pytorch neural networks ☆22 Apr 9, 2026 Updated 2 months ago