📖 A curated list of Awesome LLM/VLM Inference Papers with code: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism, etc. 🎉🎉
☆15 · Mar 30, 2025 · Updated last year
Alternatives and similar repositories for Awesome-LLM-Inference
Users that are interested in Awesome-LLM-Inference are comparing it to the libraries listed below.
- In-depth tutorials and examples on LLM training and inference infrastructure, such as PyTorch, Fairscale, NVIDIA AI Modules (cuDNN, tens… ☆22 · May 19, 2025 · Updated 10 months ago
- Complete list of new features for MySQL 5.7 ☆20 · Jan 30, 2019 · Updated 7 years ago
- ☆12 · May 3, 2025 · Updated 11 months ago
- A high-performance attention mechanism that computes softmax normalization in a single streaming pass using running accumulators (online … ☆29 · Oct 11, 2025 · Updated 6 months ago
- AI Infra LLM infer/ tensorrt-llm/ vllm ☆24 · Mar 7, 2026 · Updated last month
- PiKV: KV Cache Management System for Mixture of Experts [Efficient ML System] ☆49 · Feb 24, 2026 · Updated last month
- Prototypes and experiments for WG Device Management. ☆15 · Apr 1, 2026 · Updated last week
- Solidity programming course for creating cryptocurrencies with Ethereum and Polygon ☆11 · Aug 1, 2023 · Updated 2 years ago
- A class for synchronizing sensor readings to the system clock ☆11 · Oct 25, 2018 · Updated 7 years ago
- Minimal PyTorch implementation of TP, SP, FSDP and sharded-EMA ☆32 · Nov 27, 2025 · Updated 4 months ago
- Backend of the solopython site ☆12 · Jun 11, 2023 · Updated 2 years ago
- [ICLR 2026] Official PyTorch implementation for "ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding" ☆61 · Dec 26, 2025 · Updated 3 months ago
- ☆31 · Aug 18, 2025 · Updated 7 months ago
- Code for Radar HRRP Target Recognition Based on Variational Auto-encoder with Learnable Prior ☆11 · Nov 15, 2025 · Updated 4 months ago
- RCS Radar Simulator for Matlab ☆12 · Dec 5, 2017 · Updated 8 years ago
- Frontend for the SoloPython website