Flash Attention 2 implementation for Turing GPUs
☆109Mar 23, 2026Updated 3 months ago
Alternatives and similar repositories for flash-attention-turing
Users that are interested in flash-attention-turing are comparing it to the libraries listed below.
- A static deobfuscator for JavaScript Malware☆13May 6, 2020Updated 6 years ago
- Lsglang is a special extension of sglang that fully utilizes CPU and GPU computing resources with an efficient GPU parallel + NUMA parall…☆83Jun 22, 2026Updated last week
- [ICML 2025] Adaptive Self-improvement LLM Agentic System for ML Library Development☆17Jan 6, 2026Updated 5 months ago
- ☆12Mar 21, 2024Updated 2 years ago
- A robust Node.js proxy server that automatically rotates API keys for Gemini and OpenAI APIs when rate limits (429 errors) are encountere…☆61Apr 10, 2026Updated 2 months ago
- Communicate undetected in plain sight using zero width obfuscation.☆15Nov 5, 2021Updated 4 years ago
- ☆14Sep 4, 2024Updated last year
- ☆15Dec 21, 2025Updated 6 months ago
- A modern C++ Linux networking library based on EventLoop and multithreading☆11Apr 5, 2020Updated 6 years ago
- Standalone Flash Attention v2 kernel without libtorch dependency☆113Sep 10, 2024Updated last year
- An exercise in UI development: a clone of https://www.perplexity.ai/☆11Sep 30, 2023Updated 2 years ago
- For personal use only☆11Updated this week
- Implementations of different neural network pruning techniques☆14Aug 10, 2023Updated 2 years ago
- An interactive text-novel reading tool based on MisakaTranslator.☆12Feb 22, 2026Updated 4 months ago
- ☆11Apr 5, 2024Updated 2 years ago
- Cellular Automata - Pokemon Type Battle Simulation☆11Oct 26, 2024Updated last year
- My old book about programming for Symbian 9.x-based smartphones, in Russian☆14Jul 8, 2015Updated 10 years ago
- A fast compressor/decompressor☆15Nov 18, 2024Updated last year
- Semantic Scaffolds for Pseudocode-to-Code Generation (accepted by ACL 2020)☆14Jun 7, 2021Updated 5 years ago
- A flexible data structure for multi-input multi-output models☆10Oct 12, 2021Updated 4 years ago
- VGA LCD Core (OpenCores)☆15May 22, 2018Updated 8 years ago
- ☆14May 28, 2019Updated 7 years ago
- JAX implementation of GPTQ quantization algorithm☆10Jul 19, 2023Updated 2 years ago
- PyMT4 - Python bindings for the Metatrader 4 trading platform Project origin By rmawatson, he didn't want to be disturbed, So don't to …☆13Aug 6, 2018Updated 7 years ago
- GPU accelerated Perlin Noise in python☆11Oct 23, 2020Updated 5 years ago
- Batched routines (BLAS, LAPACK, etc.) for multi-dimensional arrays☆12Apr 10, 2022Updated 4 years ago
- SymPy with PythonCall backend (not PyCall)☆12Feb 19, 2025Updated last year
- Towards a million-node RISC-V cluster.☆14Mar 6, 2025Updated last year
- Julia Channels with defined length: Buffered and threaded iterators for machine learning.☆12Dec 13, 2020Updated 5 years ago
- Mixed-precision quantization for LLMs. Every layer refracts into a different format based on its sensitivity. Native compressed-tensors e…☆82Updated this week
- A repository for managing public, versioned releases of the Swedish Parliament Corpus.☆15Updated this week
- A collection of various custom nodes for ComfyUI (Work in progress)☆14Jun 9, 2025Updated last year
- Disk based, buffered data structures for machine learning☆12Jan 4, 2021Updated 5 years ago
- ☆19Apr 19, 2024Updated 2 years ago
- Dive into Deep Learning, with Julia programming language and Flux.jl.☆11Oct 28, 2024Updated last year
- ☆14Oct 7, 2024Updated last year
- Artifacts of EVT ASPLOS'24☆30Mar 6, 2024Updated 2 years ago
- ☆13Jul 15, 2022Updated 3 years ago
- Llama causal LM fully recreated in LibTorch. Designed to be used in Unreal Engine 5☆16Sep 19, 2024Updated last year