awni / picochat
Smaller and faster nanochat in MLX
☆32Updated last month
Alternatives and similar repositories for picochat
Users who are interested in picochat are comparing it to the libraries listed below.
- A collection of lightweight interpretability scripts to understand how LLMs think☆71Updated this week
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs☆92Updated last year
- OLMost every training recipe you need to perform data interventions with the OLMo family of models.☆59Updated this week
- ☆45Updated 2 months ago
- Because it's there.☆16Updated last year
- A collection of optimizers for MLX☆54Updated last week
- NanoGPT-speedrunning for the poor T4 enjoyers☆73Updated 7 months ago
- A simple MLX implementation for pretraining LLMs on Apple Silicon.☆84Updated 4 months ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆23Updated last year
- Seamless interface for using PyTorch distributed with Jupyter notebooks☆57Updated 3 months ago
- Exploration into the Firefly algorithm in PyTorch☆41Updated 10 months ago
- webgpu autograd library☆33Updated 6 months ago
- Collection of autoregressive model implementations☆85Updated 7 months ago
- Samples of good AI generated CUDA kernels☆94Updated 6 months ago
- Cerule - A Tiny Mighty Vision Model☆68Updated last month
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆121Updated 2 months ago
- Implementation of nougat that focuses on processing pdf locally.☆83Updated 11 months ago
- ☆59Updated last year
- Train, tune, and infer Bamba model☆137Updated 6 months ago
- EXO Gym is an open-source Python toolkit that facilitates distributed AI research.☆87Updated 2 weeks ago
- mlx image models for Apple Silicon machines☆88Updated 3 weeks ago
- Rust Implementation of micrograd☆53Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆103Updated 11 months ago
- Lego for GRPO☆30Updated 6 months ago
- 👷 Build compute kernels☆193Updated this week
- LLM training in simple, raw C/CUDA☆15Updated last year
- ☆27Updated last year
- alternative way of calculating self-attention☆18Updated last year
- ☆28Updated last year
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆108Updated 9 months ago