nathan-barry / tiny-diffusion
A character-level language diffusion model trained on Tiny Shakespeare
☆842 Updated last week
Alternatives and similar repositories for tiny-diffusion
Users who are interested in tiny-diffusion are comparing it to the repositories listed below.
- Pytorch script hot swap: Change code without unloading your LLM from VRAM ☆125 Updated 9 months ago
- ☆463 Updated 2 months ago
- A reimplementation of Stable Diffusion 3.5 in pure PyTorch ☆690 Updated 7 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆938 Updated 7 months ago
- ☆256 Updated 10 months ago
- rl from zero pretrain, can it be done? yes. ☆286 Updated 4 months ago
- Flux 2 image generation model pure C inference ☆1,395 Updated this week
- explore token trajectory trees on instruct and base models ☆150 Updated 8 months ago
- Live-bending a foundation model’s output at neural network level. ☆272 Updated 9 months ago
- ☆214 Updated last week
- ☆178 Updated last month
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆345 Updated last year
- noise_step: Training in 1.58b With No Gradient Memory ☆220 Updated last year
- ☆250 Updated last year
- Simple & Scalable Pretraining for Neural Architecture Research ☆307 Updated last month
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model ☆262 Updated 8 months ago
- Code for the Fractured Entangled Representation Hypothesis position paper! ☆221 Updated 2 months ago
- Getting crystal-like representations with harmonic loss ☆195 Updated 9 months ago
- Build your own visual reasoning model ☆417 Updated 2 weeks ago
- ~950 line, minimal, extensible LLM inference engine built from scratch. ☆396 Updated 3 weeks ago
- A pure NumPy implementation of Mamba. ☆222 Updated last year
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆584 Updated 3 months ago
- Dion optimizer algorithm ☆420 Updated last week
- Open-source release accompanying Gao et al. 2025 ☆498 Updated last month
- Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and … ☆218 Updated last year
- GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T… ☆342 Updated 5 months ago
- ☆540 Updated 5 months ago
- A tiny autograd engine with a Jax-like API ☆74 Updated 6 months ago
- Diffusion on syntax trees for program synthesis ☆480 Updated last year
- Autograd to GPT-2 completely from scratch ☆126 Updated 5 months ago