dInfer: An Efficient Inference Framework for Diffusion Language Models
☆458 · Feb 11, 2026 · Updated 2 months ago
Alternatives and similar repositories for dInfer
Users that are interested in dInfer are comparing it to the libraries listed below.
- Easy and Efficient dLLM Fine-Tuning ☆247 · Mar 2, 2026 · Updated last month
- Official repository Flash Local Linear Attention ☆23 · Apr 23, 2026 · Updated last week
- [NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference ☆41 · Oct 29, 2025 · Updated 6 months ago
- ☆47 · Sep 8, 2025 · Updated 7 months ago
- SGLang Kernel Wheel Index ☆22 · Apr 21, 2026 · Updated last week
- d3LLM: Ultra-Fast Diffusion LLM 🚀 ☆120 · Updated this week
- Codes for DATA: Differentiable ArchiTecture Approximation. ☆11 · Jul 22, 2021 · Updated 4 years ago
- DeeperGEMM: crazy optimized version ☆86 · May 5, 2025 · Updated 11 months ago
- [ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series. ☆494 · Jan 28, 2026 · Updated 3 months ago
- LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group. ☆406 · Feb 12, 2026 · Updated 2 months ago
- diffusers with search engine ☆12 · Jan 13, 2026 · Updated 3 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆3,758 · Nov 12, 2025 · Updated 5 months ago
- A curated list for awesome discrete diffusion models resources. ☆553 · Sep 9, 2025 · Updated 7 months ago
- A CUDA kernel optimization toolkit for validation, benchmarking, Nsight Compute profiling, bottleneck analysis, and iterative tuning. It … ☆130 · Apr 22, 2026 · Updated last week
- ☆55 · Apr 14, 2026 · Updated 2 weeks ago
- A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation. ☆125 · Dec 25, 2025 · Updated 4 months ago
- A forked version of flux-fast that makes flux-fast even faster with cache-dit, 3.3x speedup on NVIDIA L20. ☆24 · Jul 18, 2025 · Updated 9 months ago
- Benchmark tests supporting the TiledCUDA library. ☆18 · Nov 19, 2024 · Updated last year
- Model souping for LLMs ☆73 · Nov 18, 2025 · Updated 5 months ago
- Vogent Turn: fast, open-source turn-detection for Voice AI applications ☆49 · Oct 28, 2025 · Updated 6 months ago
- ☆21 · Mar 3, 2026 · Updated last month
- [ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models ☆50 · Feb 26, 2026 · Updated 2 months ago
- ☆36 · Mar 7, 2025 · Updated last year
- Nsight Python is a Python kernel profiling interface based on NVIDIA Nsight Tools ☆198 · Apr 24, 2026 · Updated last week
- ☆12 · Mar 17, 2024 · Updated 2 years ago
- Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs ☆29 · Dec 17, 2024 · Updated last year
- ☆52 · May 19, 2025 · Updated 11 months ago
- ☆64 · Jul 11, 2025 · Updated 9 months ago
- Dream 7B, a large diffusion language model ☆1,232 · Nov 21, 2025 · Updated 5 months ago
- Repo for SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting (ISCA25) ☆72 · Apr 25, 2025 · Updated last year
- Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference ☆249 · Feb 3, 2026 · Updated 2 months ago
- [ICLR 2026] Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding ☆32 · Jan 27, 2026 · Updated 3 months ago
- Code accompanying the NeurIPS 2019 paper AutoAssist: A Framework to Accelerate Training of Deep Neural Networks. ☆14 · Oct 3, 2022 · Updated 3 years ago
- [ICLR 2026] Official repository of "Beyond Fixed: Training-Free Variable-Length Denoising for Diffusion Large Language Models" ☆164 · Feb 16, 2026 · Updated 2 months ago
- High-performance distributed data shuffling (all-to-all) library for MoE training and inference ☆119 · Mar 7, 2026 · Updated last month
- dLLM: Simple Diffusion Language Modeling ☆2,432 · Apr 15, 2026 · Updated 2 weeks ago
- A Secure Version of DATAVIEW using SGX techniques. ☆10 · Jul 6, 2021 · Updated 4 years ago
- ☆15 · Dec 2, 2019 · Updated 6 years ago
- How to plot for papers, slides, demos, etc. ☆10 · Apr 7, 2022 · Updated 4 years ago