xjdr-alt / muzero_sketch
☆40Jul 26, 2024Updated last year
Alternatives and similar repositories for muzero_sketch
Users that are interested in muzero_sketch are comparing it to the libraries listed below
- Approximating the joint distribution of language models via MCTS☆22Nov 3, 2024Updated last year
- Simple Transformer in Jax☆142Jun 22, 2024Updated last year
- look how they massacred my boy☆63Oct 16, 2024Updated last year
- Plotting (entropy, varentropy) for small LMs☆99May 20, 2025Updated 8 months ago
- ☆21Feb 8, 2026Updated last week
- smolLM with Entropix sampler on pytorch☆149Oct 31, 2024Updated last year
- Modify Entropy Based Sampling to work with Mac Silicon via MLX☆49Nov 6, 2024Updated last year
- ☆33Nov 4, 2024Updated last year
- Fast, free, easy, and object-agnostic video anonymization☆11Dec 12, 2020Updated 5 years ago
- Companion repository to "Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models"☆14May 31, 2023Updated 2 years ago
- High-performance tokenized language data-loader for Python C++ extension☆14Jul 22, 2024Updated last year
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods.☆32Jun 5, 2025Updated 8 months ago
- Rust bindings for CTranslate2☆14Jun 21, 2023Updated 2 years ago
- LiteGPT: A 124M Small Language Model (SLM) pre-trained on FineWeb and fine-tuned on Alpaca.☆34Dec 16, 2025Updated 2 months ago
- Meme search engine for the real shitposters☆10Jan 27, 2026Updated 3 weeks ago
- supporting pytorch FSDP for optimizers☆84Dec 8, 2024Updated last year
- ☆34Aug 16, 2024Updated last year
- Quick Notebook Tutorials☆36Jul 17, 2025Updated 7 months ago
- FlexAttention w/ FlashAttention3 Support☆27Oct 5, 2024Updated last year
- ☆34Sep 10, 2024Updated last year
- ☆16Sep 27, 2023Updated 2 years ago
- MLX port for xjdr's entropix sampler (mimics jax implementation)☆61Nov 4, 2024Updated last year
- ☆12Jun 2, 2023Updated 2 years ago
- Training Models Daily☆16Dec 19, 2023Updated 2 years ago
- Agentic Deep Graph Reasoning Implementation☆14Mar 4, 2025Updated 11 months ago
- Benchmark tests supporting the TiledCUDA library.☆18Nov 19, 2024Updated last year
- Python library for building and sharing dataframe-agnostic, sklearn-style transformers and ml models for data science competitions.☆26Feb 10, 2026Updated last week
- Writing FLUX in Triton☆41Sep 22, 2024Updated last year
- NSA Triton Kernels written with GPT5 and Opus 4.1☆70Aug 12, 2025Updated 6 months ago
- A challenging aggregation benchmark for long-context models☆35Updated this week
- Minimal Implementation of VCRec (2024) for collapse prevention.☆18Jan 28, 2025Updated last year
- A `tree` util enhanced with tokens, lines, and components. `pip install -U tree_plus`☆15Nov 24, 2025Updated 2 months ago
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings.☆63Jul 8, 2024Updated last year
- Training hybrid models for dummies.☆29Nov 1, 2025Updated 3 months ago
- [ICLR 2026 🔥] Dr.LLM: Dynamic Layer Routing in LLMs☆41Oct 15, 2025Updated 4 months ago
- ☆19Dec 4, 2025Updated 2 months ago
- An experimental implementation of compiler-driven automatic sharding of models across a given device mesh.☆52Feb 10, 2026Updated last week
- ☆48Feb 23, 2025Updated 11 months ago
- An implementation of the Llama architecture, to instruct and delight☆21May 31, 2025Updated 8 months ago