srush / GPTWorld

A puzzle to learn about prompting

☆132

Alternatives and similar repositories for GPTWorld

Users who are interested in GPTWorld are comparing it to the libraries listed below.

srush / Transformer-Puzzles
Puzzles for exploring transformers
☆356 Updated 2 years ago
gautierdag / bpeasy
Fast bare-bones BPE for modern tokenizer training
☆164 Updated last month
normster / llm_rules
RuLES: a benchmark for evaluating rule-following in language models
☆228 Updated 5 months ago
justinchiu / openlogprobs
Extract full next-token probabilities via language model APIs
☆247 Updated last year
allenai / fm-cheatsheet
Website for hosting the Open Foundation Models Cheat Sheet.
☆267 Updated 3 months ago
cloneofsimo / min-max-gpt
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
☆130 Updated last year
xrsrke / pipegoose
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
☆86 Updated last year
magicproduct / hash-hop
Long context evaluation for large language models
☆220 Updated 5 months ago
llm-efficiency-challenge / neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
☆256 Updated last year
AblateIt / finetune-study
Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes.
☆82 Updated last year
LeonGuertler / UnstableBaselines
☆96 Updated last week
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆221 Updated 3 weeks ago
r-three / git-theta
git extension for {collaborative, communal, continual} model development
☆217 Updated 8 months ago
srush / raspy
An interactive exploration of Transformer programming.
☆267 Updated last year
ayaka14732 / llama-2-jax
JAX implementation of the Llama 2 model
☆219 Updated last year
dshah3 / GPU-Puzzles
Solve puzzles. Learn CUDA.
☆64 Updated last year
SumanthRH / tokenization
A comprehensive deep dive into the world of tokens
☆225 Updated last year
srush / Autodiff-Puzzles
☆442 Updated 9 months ago
xjdr-alt / simple_transformer
Simple Transformer in Jax
☆138 Updated last year
mcleish7 / arithmetic
Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)
☆190 Updated last year
sangmichaelxie / cs324_p2
Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)
☆105 Updated 2 years ago
tysam-code / hlb-gpt
Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi…
☆350 Updated last year
abacaj / train-with-fsdp
☆93 Updated last year
mlfoundations / open_lm
A repository for research on medium sized language models.
☆510 Updated 2 months ago
cloneofsimo / min-fsdp
☆83 Updated last year
joey00072 / Tinytorch
A really tiny autograd engine
☆95 Updated 2 months ago
google-deepmind / nanodo
☆275 Updated last year
google-deepmind / mishax
☆136 Updated 4 months ago
huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆268 Updated last year
apple / ml-sigma-reparam
☆307 Updated last year