☆40Jul 26, 2024Updated last year
Alternatives and similar repositories for muzero_sketch
Users that are interested in muzero_sketch are comparing it to the libraries listed below
- Simple Transformer in Jax☆143Jun 22, 2024Updated last year
- look how they massacred my boy☆63Oct 16, 2024Updated last year
- Plotting (entropy, varentropy) for small LMs☆99May 20, 2025Updated 9 months ago
- ☆21Updated this week
- smolLM with Entropix sampler on pytorch☆149Oct 31, 2024Updated last year
- Modify Entropy Based Sampling to work with Mac Silicon via MLX☆49Nov 6, 2024Updated last year
- ☆33Nov 4, 2024Updated last year
- Fast, free, easy, and object-agnostic video anonymization☆11Dec 12, 2020Updated 5 years ago
- Companion repository to "Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models"☆14May 31, 2023Updated 2 years ago
- LiteGPT: A 124M Small Language Model (SLM) pre-trained on FineWeb and fine-tuned on Alpaca.☆34Dec 16, 2025Updated 2 months ago
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods.☆32Jun 5, 2025Updated 9 months ago
- ☆12Jan 4, 2024Updated 2 years ago
- Rust bindings for CTranslate2☆14Jun 21, 2023Updated 2 years ago
- Meme search engine for the real shitposters☆10Jan 27, 2026Updated last month
- ☆35Aug 16, 2024Updated last year
- FlexAttention w/ FlashAttention3 Support☆27Oct 5, 2024Updated last year
- Quick Notebook Tutorials☆36Jul 17, 2025Updated 7 months ago
- ☆34Sep 10, 2024Updated last year
- ☆16Sep 27, 2023Updated 2 years ago
- MLX port for xjdr's entropix sampler (mimics jax implementation)☆62Nov 4, 2024Updated last year
- Training Models Daily☆16Dec 19, 2023Updated 2 years ago
- ☆12Jun 2, 2023Updated 2 years ago
- Agentic Deep Graph Reasoning Implementation☆14Mar 4, 2025Updated last year
- Writing FLUX in Triton☆42Sep 22, 2024Updated last year
- NSA Triton Kernels written with GPT5 and Opus 4.1☆70Aug 12, 2025Updated 6 months ago
- Minimal Implementation of VCRec (2024) for collapse prevention.☆18Jan 28, 2025Updated last year
- A challenging aggregation benchmark for long-context models☆38Feb 22, 2026Updated 2 weeks ago
- A `tree` util enhanced with tokens, lines, and components. `pip install -U tree_plus`☆15Nov 24, 2025Updated 3 months ago
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings.☆63Jul 8, 2024Updated last year
- Training hybrid models for dummies.☆29Nov 1, 2025Updated 4 months ago
- ☆115Dec 1, 2024Updated last year
- An experimental implementation of compiler-driven automatic sharding of models across a given device mesh.☆52Updated this week
- ☆48Feb 23, 2025Updated last year
- An implementation of the Llama architecture, to instruct and delight☆21May 31, 2025Updated 9 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training☆132Apr 17, 2024Updated last year
- RAG Agent for the ARC AGI Challenge☆20Jul 1, 2024Updated last year
- ☆193Feb 27, 2026Updated last week
- Entropy Based Sampling and Parallel CoT Decoding☆3,432Nov 13, 2024Updated last year
- ☆24Dec 26, 2023Updated 2 years ago