xjdr-alt / simple_transformer
Simple Transformer in Jax
☆100 Updated 2 months ago
Related projects:
- Aidan Bench attempts to measure <big_model_smell> in LLMs. ☆64 Updated this week
- GPT-2 (124M) quality in 5B tokens ☆227 Updated last week
- The history files when recording human interaction while solving ARC tasks ☆91 Updated this week
- Just large language models. Hackable, with as little abstraction as possible. Done for my own purposes, feel free to rip. ☆42 Updated last year
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ☆143 Updated this week
- ☆27 Updated 2 months ago
- Comprehensive analysis of the differences in performance of QLoRA, LoRA, and full fine-tunes. ☆81 Updated last year
- A really tiny autograd engine ☆85 Updated 5 months ago
- Sparse autoencoders ☆297 Updated last week
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆260 Updated last month
- Helpers and such for working with Lambda Cloud ☆51 Updated 10 months ago
- A puzzle to learn about prompting ☆106 Updated last year
- Long context evaluation for large language models ☆148 Updated this week
- Just a bunch of benchmark logs for different LLMs ☆112 Updated last month
- Draw more samples ☆159 Updated 2 months ago
- ☆89 Updated 11 months ago
- Extract full next-token probabilities via language model APIs ☆226 Updated 6 months ago
- Full finetuning of large language models without large memory requirements ☆94 Updated 8 months ago
- seqax = sequence modeling + JAX ☆129 Updated 2 months ago
- Simplex Random Feature attention, in PyTorch ☆71 Updated 11 months ago
- ☆24 Updated 5 months ago
- ☆48 Updated 11 months ago
- ☆68 Updated 2 months ago
- ☆97 Updated 5 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (2024) ☆169 Updated 3 months ago
- An automated tool for discovering insights from research paper corpora ☆131 Updated 3 months ago
- Port of Andrej Karpathy's nanoGPT to Apple MLX framework. ☆96 Updated 7 months ago
- Following master Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish ☆165 Updated last month
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆60 Updated 4 months ago
- Fast bare-bones BPE for modern tokenizer training ☆138 Updated 3 weeks ago