srush / GPTWorld
A puzzle to learn about prompting
☆121 Updated last year
Related projects
Alternatives and complementary repositories for GPTWorld
- Puzzles for exploring transformers ☆325 Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆84 Updated last week
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆113 Updated 7 months ago
- JAX implementation of the Llama 2 model ☆210 Updated 9 months ago
- Fast bare-bones BPE for modern tokenizer training ☆142 Updated last month
- ☆101 Updated 3 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆178 Updated 5 months ago
- RuLES: a benchmark for evaluating rule-following in language models ☆211 Updated last month
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆80 Updated 11 months ago
- ☆73 Updated 4 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆95 Updated 2 weeks ago
- Erasing concepts from neural representations with provable guarantees ☆209 Updated last week
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 Updated 6 months ago
- Extract full next-token probabilities via language model APIs ☆229 Updated 8 months ago
- ☆197 Updated 4 months ago
- ☆161 Updated last year
- Resources from the EleutherAI Math Reading Group ☆51 Updated last month
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆49 Updated 3 weeks ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ☆151 Updated this week
- Solve puzzles. Learn CUDA. ☆61 Updated 11 months ago
- Cost aware hyperparameter tuning algorithm ☆123 Updated 4 months ago
- ☆391 Updated last month
- Simple Transformer in Jax ☆119 Updated 4 months ago
- seqax = sequence modeling + JAX ☆133 Updated 4 months ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆81 Updated last year
- Website for hosting the Open Foundation Models Cheat Sheet. ☆257 Updated 4 months ago
- Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL. ☆200 Updated 9 months ago
- git extension for {collaborative, communal, continual} model development ☆205 Updated this week
- An interactive exploration of Transformer programming. ☆246 Updated last year
- code for training & evaluating Contextual Document Embedding models ☆117 Updated this week