nathan-barry / tiny-diffusion
A character-level discrete diffusion transformer trained on Tiny Shakespeare
☆64Updated this week
Alternatives and similar repositories for tiny-diffusion
Users who are interested in tiny-diffusion are comparing it to the repositories listed below
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆107Updated 7 months ago
- Focused on fast experimentation and simplicity☆75Updated 10 months ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)☆34Updated 8 months ago
- implementation of https://arxiv.org/pdf/2312.09299☆21Updated last year
- H-Net Dynamic Hierarchical Architecture☆80Updated last month
- σ-GPT: A New Approach to Autoregressive Models☆68Updated last year
- DeMo: Decoupled Momentum Optimization☆194Updated 10 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers☆72Updated 6 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276☆28Updated 5 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆102Updated 10 months ago
- ☆65Updated 7 months ago
- ☆102Updated 3 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind☆57Updated 5 months ago
- Cerule - A Tiny Mighty Vision Model☆67Updated last year
- Collection of autoregressive model implementations☆86Updated 6 months ago
- ☆24Updated 5 months ago
- look how they massacred my boy☆63Updated last year
- Working implementation of DeepSeek MLA☆44Updated 9 months ago
- ☆81Updated last year
- RWKV-7: Surpassing GPT☆98Updated 11 months ago
- smolbox of recipes☆28Updated 6 months ago
- [WIP] Transformer to embed Danbooru labelsets☆13Updated last year
- Getting crystal-like representations with harmonic loss☆192Updated 6 months ago
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated this week
- ☆40Updated last year
- Minimal (400 LOC) implementation of Maximum (multi-node, FSDP) GPT training☆132Updated last year
- Open-source AlphaEvolve☆66Updated 5 months ago
- Video+code lecture on building nanoGPT from scratch☆68Updated last year
- ☆30Updated last year
- Simple high-throughput inference library☆149Updated 5 months ago