nathan-barry / tiny-diffusion
A character-level language diffusion model trained on Tiny Shakespeare
☆790 · Updated last week
Alternatives and similar repositories for tiny-diffusion
Users who are interested in tiny-diffusion are comparing it to the repositories listed below.
- Pytorch script hot swap: Change code without unloading your LLM from VRAM ☆125 · Updated 8 months ago
- ☆460 · Updated last month
- ☆177 · Updated last month
- Live-bending a foundation model’s output at neural network level. ☆271 · Updated 9 months ago
- ☆252 · Updated 10 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆934 · Updated 6 months ago
- A reimplementation of Stable Diffusion 3.5 in pure PyTorch ☆691 · Updated 6 months ago
- ☆47 · Updated 9 months ago
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆628 · Updated 9 months ago
- explore token trajectory trees on instruct and base models ☆150 · Updated 7 months ago
- ☆250 · Updated last year
- ☆537 · Updated 5 months ago
- Pivotal Token Search ☆142 · Updated 2 weeks ago
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks ☆226 · Updated 6 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆345 · Updated last year
- Code for the Fractured Entangled Representation Hypothesis position paper! ☆220 · Updated 2 months ago
- Code release for "LLMs can see and hear without any training" ☆456 · Updated 8 months ago
- ☆214 · Updated 2 weeks ago
- GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T… ☆319 · Updated 4 months ago
- noise_step: Training in 1.58b With No Gradient Memory ☆220 · Updated last year
- Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and … ☆218 · Updated last year
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆293 · Updated 2 months ago
- A pure NumPy implementation of Mamba. ☆222 · Updated last year
- Mistral7B playing DOOM ☆138 · Updated last year
- rl from zero pretrain, can it be done? yes. ☆286 · Updated 3 months ago
- Getting crystal-like representations with harmonic loss ☆194 · Updated 9 months ago
- in this repository, i'm going to implement increasingly complex llm inference optimizations ☆76 · Updated 7 months ago
- This repository contains a simple llama3 implementation in pure jax. ☆70 · Updated 10 months ago
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model ☆256 · Updated 7 months ago
- Autograd to GPT-2 completely from scratch ☆125 · Updated 4 months ago