JirkaKlimes / jit-implementation
🚀 JIT Implementation: Code That Writes Itself
☆100Updated last month
Related projects
Alternatives and complementary repositories for jit-implementation
- Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead☆121Updated this week
- An introduction to LLM Sampling☆65Updated 2 weeks ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆113Updated 3 weeks ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆85Updated 2 months ago
- Tokun to can tokens☆15Updated last week
- Generate python documentation using LLMs☆57Updated 4 months ago
- ☆118Updated 3 months ago
- ☆43Updated 2 months ago
- An interactive HTML pretty-printer for machine learning research in IPython notebooks.☆338Updated this week
- ☆229Updated last month
- PyTorch implementation of models from the Zamba2 series.☆158Updated this week
- Reasoning Computers. Lambda Calculus, Fully Differentiable. Also Neural Stacks, Queues, Arrays, Lists, Trees, and Latches.☆235Updated 3 weeks ago
- run paligemma in real time☆123Updated 6 months ago
- A pure NumPy implementation of Mamba.☆216Updated 4 months ago
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full…☆461Updated this week
- The AdEMAMix Optimizer: Better, Faster, Older.☆173Updated 2 months ago
- Embed arbitrary modalities (images, audio, documents, etc) into large language models.☆176Updated 7 months ago
- ☆112Updated this week
- look how they massacred my boy☆58Updated last month
- MLX port for xjdr's entropix sampler (mimics jax implementation)☆56Updated 2 weeks ago
- Our solution for the ARC challenge 2024☆33Updated last week
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr…☆51Updated 3 weeks ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch☆56Updated this week
- realtime latent world model inference demo☆35Updated last week
- σ-GPT: A New Approach to Autoregressive Models☆59Updated 3 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs"☆154Updated last month
- A fast minimalistic implementation of guided generation on Apple Silicon using Outlines and MLX☆47Updated 9 months ago
- A pipeline parallel training script for LLMs.☆83Updated this week
- Alice in Wonderland code base for experiments and raw experiments data☆109Updated last month