ash80 / diffusion-gpt
From baby GPT to diffusion GPT: An annotated implementation of a character-level discrete diffusion model (adapted from Karpathy’s baby GPT).
☆242 Updated 3 months ago
Alternatives and similar repositories for diffusion-gpt
Users who are interested in diffusion-gpt are comparing it to the libraries listed below.
- ☆301 Updated 5 months ago
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆276 Updated 5 months ago
- The code repository of the paper: Competition and Attraction Improve Model Fusion ☆168 Updated 4 months ago
- One-click 3D Gaussian Splatting generation from a single image. ☆51 Updated this week
- Examples, end-2-end tutorials and apps built using Liquid AI Foundational Models (LFM) and the LEAP SDK ☆818 Updated last week
- Anemoi: A Semi-Centralized Multi-agent Systems Based on Agent-to-Agent Communication MCP server from Coral Protocol ☆370 Updated 4 months ago
- ☆126 Updated 3 months ago
- ☆127 Updated 4 months ago
- ~950 line, minimal, extensible LLM inference engine built from scratch. ☆241 Updated this week
- The official code implementation for "Cache-to-Cache: Direct Semantic Communication Between Large Language Models" ☆312 Updated this week
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ☆271 Updated 2 months ago
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆459 Updated 4 months ago
- Train transformer language models with reinforcement learning. ☆19 Updated 10 months ago
- WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups ov… ☆550 Updated this week
- The State Of The Art, intelligence ☆157 Updated 5 months ago
- Deep research agents using MiniMax M2.1 interleaved thinking ☆189 Updated 2 weeks ago
- advanced, scalable, no-code RAG ☆273 Updated last week
- The official GitHub Page for MiniMax ☆60 Updated 2 months ago
- SimpleMem: Efficient Lifelong Memory for LLM Agents ☆247 Updated this week
- Developer Asset Hub for NVIDIA Nemotron — A one-stop resource for training recipes, usage cookbooks, and full end-to-end reference exampl… ☆314 Updated last week
- ☆22 Updated last year
- Implementation of the MetaController proposed in "Emergent temporal abstractions in autoregressive models enable hierarchical reinforceme… ☆84 Updated this week
- ☆137 Updated 7 months ago
- Official Project Page for Deep Delta Learning (https://huggingface.co/papers/2601.00417) ☆282 Updated last week
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B ☆559 Updated last month
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆494 Updated 4 months ago
- Clean, reusable paper implementations for trending papers on alphaXiv ☆131 Updated this week
- A CLI to estimate inference memory requirements for Hugging Face models, written in Python. ☆168 Updated this week
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning ☆37 Updated 7 months ago
- Official implementation of "Continuous Autoregressive Language Models" ☆686 Updated last month