Collection of Acceleration Methods for Generative AI
☆29Dec 9, 2025Updated 3 months ago
Alternatives and similar repositories for Awesome-Acceleration-GenAI
Users who are interested in Awesome-Acceleration-GenAI are comparing it to the libraries listed below.
- [NeurIPS 2025 Spotlight] LeMiCa: Lexicographic Minimax Path Caching for Efficient Diffusion-Based Video Generation☆105Feb 12, 2026Updated last month
- Simplest online-softmax notebook for explaining Flash Attention☆16Jan 27, 2026Updated 2 months ago
- A curated list of research papers, resources, and advancements on Diffusion Cache and related efficient diffusion model acceleration tech…☆78Nov 4, 2025Updated 4 months ago
- FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.☆58Feb 6, 2026Updated last month
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi…☆23Oct 1, 2025Updated 5 months ago
- Personal paper-reading notes with summaries of every paper I have read, updated almost daily.☆16Nov 6, 2021Updated 4 years ago
- A web-based tool for diagnosing ensemble few-shot classifiers☆15Sep 13, 2022Updated 3 years ago
- ☆20Oct 5, 2025Updated 5 months ago
- DatasetResearch: Benchmarking Agent Systems for Demand-Driven Dataset Discovery☆20Sep 24, 2025Updated 6 months ago
- Awesome list for High Performance Computing / Parallel Computing resources.☆12Sep 20, 2017Updated 8 years ago
- ☆11May 3, 2019Updated 6 years ago
- A PyTorch-native inference engine with hybrid cache acceleration and massive parallelism for DiTs.☆1,108Updated this week
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".☆17Feb 9, 2026Updated last month
- ☆20Sep 11, 2025Updated 6 months ago
- Code for the EACL 2024 main-conference paper "Generative Dense Retrieval: Memory Can Be a Burden"☆32Jan 19, 2024Updated 2 years ago
- Sampling techniques for Candle.☆19Apr 3, 2024Updated last year
- Blogs that I'm actively following.☆14Sep 17, 2023Updated 2 years ago
- Empowering LLM Agents for Real-World Computer System Optimization☆17Sep 10, 2025Updated 6 months ago
- Fast and memory-efficient exact attention☆21Mar 13, 2026Updated 2 weeks ago
- A std::execution-style runtime context and high-performance RPC transport built on OpenUCX, including CUDA/ROCM/... devices with RDMA.☆30Feb 22, 2026Updated last month
- 3rd party dependencies for DALI project☆11Mar 10, 2026Updated 2 weeks ago
- A survey for visual generation alignment☆128Nov 9, 2025Updated 4 months ago
- A coding agent for the browser☆21Jan 8, 2026Updated 2 months ago
- ☆10Dec 25, 2022Updated 3 years ago
- 😎 Awesome papers on token redundancy reduction☆11Mar 12, 2025Updated last year
- A collection of the world's best computer vision labs and lecture materials.☆14Feb 23, 2025Updated last year
- High-performance RMSNorm implementation using SM core storage (registers and shared memory)☆30Jan 22, 2026Updated 2 months ago
- OAuth authentication plugin for personal coding assistance with ChatGPT Plus/Pro subscriptions - uses OpenAI's official authentication me…☆28Mar 21, 2026Updated last week
- EXL2 quantization generalized to other models.☆10Mar 17, 2024Updated 2 years ago
- Implementing DP/TP/PP with torch.distributed☆13Dec 28, 2023Updated 2 years ago
- ☆14Jul 23, 2025Updated 8 months ago
- OpenSFEDS, a near-eye gaze estimation dataset containing approximately 2M synthetic camera-photosensor image pairs sampled at 500 Hz unde…☆13Apr 18, 2024Updated last year
- A very simple "hello world" project for deploying Prefect 2 to a docker container on Google Compute Engine.☆11Aug 16, 2022Updated 3 years ago
- Debug print operator for cudagraph debugging☆14Aug 2, 2024Updated last year
- (ICCV2025) EEdit⚡: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing☆60Sep 17, 2025Updated 6 months ago
- T22_034_han_shi_hao_CRDDC_2022_SourceCode☆11Dec 29, 2023Updated 2 years ago
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public☆173Feb 4, 2026Updated last month
- HDF5 Performance Analysis Checklist☆13Dec 23, 2024Updated last year
- GPU methods for alpha matting, including cutting edge research algorithms by Philip G. Lee.☆12Jan 8, 2014Updated 12 years ago