A Triton-only attention backend for vLLM
☆24Mar 17, 2026Updated last week
Alternatives and similar repositories for vllm-triton-backend
Users that are interested in vllm-triton-backend are comparing it to the libraries listed below.
- Development containers for triton and triton-cpu☆24Mar 9, 2026Updated 2 weeks ago
- ☆23Jul 11, 2025Updated 8 months ago
- A Triton JIT runtime and ffi provider in C++☆32Updated this week
- Wave: Python Domain-Specific Language for High Performance Machine Learning☆48Updated this week
- Automatic differentiation for Triton Kernels☆29Aug 12, 2025Updated 7 months ago
- ☆128Updated this week
- Ship correct and fast LLM kernels to PyTorch☆147Jan 14, 2026Updated 2 months ago
- Framework to reduce autotune overhead to zero for well known deployments.☆97Sep 19, 2025Updated 6 months ago
- Hack to start another instance of the wpa_supplicant daemon☆13Nov 16, 2017Updated 8 years ago
- A lightweight triton-based General Matrix Multiplication (GEMM) library.☆55Updated this week
- VPP dataplane for Calico☆15Oct 23, 2020Updated 5 years ago
- Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 30+ benchmarks☆15Feb 17, 2025Updated last year
- OSTree-native container configs for a custom Fedora Kinoite for personal use☆16Jan 4, 2026Updated 2 months ago
- FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.☆58Feb 6, 2026Updated last month
- Download Telegram chats as IRC-style TXT files. Optionally downloads media.☆27Jan 1, 2026Updated 2 months ago
- Julia implementation of the Flash Attention algorithm☆19Sep 4, 2023Updated 2 years ago
- WaferLLM: Large Language Model Inference at Wafer Scale☆96Jan 7, 2026Updated 2 months ago
- contents to be displayed at our projects-page☆17Dec 18, 2023Updated 2 years ago
- ☆13Jan 7, 2025Updated last year
- Parallel Optimization of Motion Estimation (ME) module based on CUDA☆16Mar 25, 2016Updated 10 years ago
- wpa_supplicant for Windows☆14Mar 23, 2024Updated 2 years ago
- Cute layout visualization☆33Jan 18, 2026Updated 2 months ago
- Boosting GPU utilization for LLM serving via dynamic spatial-temporal prefill & decode orchestration☆39Jan 8, 2026Updated 2 months ago
- Ampere optimized llama.cpp☆34Jan 30, 2026Updated last month
- my solution for UC Berkeley AI projects pacman☆11Jul 25, 2020Updated 5 years ago
- Music large model based on InternLM2-chat.☆23Dec 21, 2024Updated last year
- A website that showcases interesting projects, using Angular JS.☆12Mar 17, 2026Updated last week
- incubator repo for CUDA-TileIR backend☆122Mar 18, 2026Updated last week
- ☆12May 23, 2018Updated 7 years ago
- Nextcloud Sync Daemon for Kobo eReaders☆12May 17, 2025Updated 10 months ago
- Manages vllm-nccl dependency☆17Jun 3, 2024Updated last year
- DeepSeek-V3.2-Exp DSA Warmup Lightning Indexer training operator based on tilelang☆44Nov 19, 2025Updated 4 months ago
- ☆17Mar 26, 2025Updated last year
- TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels☆197Updated this week
- Pure Triton kernels for Qwen3.5-27B inference on NVIDIA B200☆83Feb 28, 2026Updated 3 weeks ago
- ☆15Apr 28, 2023Updated 2 years ago
- C-compatible enum for Julia☆15Dec 23, 2023Updated 2 years ago
- ☆44Sep 8, 2025Updated 6 months ago
- Fork of Enzyme to work on Reverse-Mode Differentiation at the MLIR-level.☆11Apr 23, 2023Updated 2 years ago