openai/transformer-debugger

☆4,122

Alternatives and similar repositories for transformer-debugger

Users interested in transformer-debugger are comparing it to the libraries listed below.

karpathy / minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
☆10,656 · Jul 1, 2024 · Updated 2 years ago
Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆24,591 · Updated this week
meta-pytorch / torchtune
PyTorch native post-training library
☆5,795 · Updated this week
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆7,273 · Jun 17, 2026 · Updated last month
openai / weak-to-strong
☆2,550 · May 19, 2024 · Updated 2 years ago
mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆7,252 · Jul 11, 2024 · Updated 2 years ago
huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,653 · May 26, 2026 · Updated 2 months ago
deepspeedai / DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
☆42,844 · Updated this week
huggingface / peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
☆21,473 · Updated this week
huggingface / trl
Train transformer language models with reinforcement learning.
☆18,976 · Updated this week
openai / automated-interpretability
☆1,083 · Mar 6, 2024 · Updated 2 years ago
TransformerLensOrg / TransformerLens
A library for mechanistic interpretability of GPT-style language models
☆3,737 · Updated this week
EleutherAI / lm-evaluation-harness
A framework for few-shot evaluation of language models.
☆13,486 · Jul 13, 2026 · Updated 2 weeks ago
allenai / OLMo
Modeling, training, eval, and inference code for OLMo
☆6,612 · Nov 24, 2025 · Updated 8 months ago
lm-sys / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,509 · May 1, 2026 · Updated 3 months ago
NVIDIA / Megatron-LM
Ongoing research training transformer models at scale
☆17,286 · Updated this week
meta-pytorch / gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
☆6,234 · Aug 22, 2025 · Updated 11 months ago
openai / simple-evals
☆4,588 · Apr 22, 2026 · Updated 3 months ago
bitsandbytes-foundation / bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
☆8,377 · Updated this week
huggingface / datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
☆3,242 · Updated this week
LargeWorldModel / LWM
Large World Model -- Modeling Text and Video with Millions Context
☆7,425 · Oct 19, 2024 · Updated last year
karpathy / llm.c
LLM training in simple, raw C/CUDA
☆30,690 · Jun 26, 2025 · Updated last year
triton-lang / triton
Development repository for the Triton language and compiler
☆19,830 · Updated this week
huggingface / text-generation-inference
Large Language Model Text Generation Inference
☆10,887 · Mar 21, 2026 · Updated 4 months ago
sgl-project / sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
☆31,048 · Updated this week
microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆22,175 · Jan 23, 2026 · Updated 6 months ago
OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,868 · Jul 14, 2026 · Updated 2 weeks ago
stanfordnlp / dspy
DSPy: The framework for programming—not prompting—language models
☆36,508 · Updated this week
vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆87,865 · Updated this week
openai / evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
☆19,081 · Apr 14, 2026 · Updated 3 months ago
BlinkDL / RWKV-LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)…
☆14,643 · Jul 23, 2026 · Updated last week
huggingface / nanotron
Minimalistic large language model 3D-parallelism training
☆2,769 · May 26, 2026 · Updated 2 months ago
axolotl-ai-cloud / axolotl
Go ahead and axolotl questions
☆12,291 · Updated this week
allenai / open-instruct
AllenAI's post-training codebase
☆3,814 · Updated this week
pytorch / torchtitan
A PyTorch native platform for training generative AI models
☆5,577 · Updated this week
haotian-liu / LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
☆24,961 · Aug 12, 2024 · Updated last year
artidoro / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆10,979 · Jun 10, 2024 · Updated 2 years ago
facebookresearch / xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
☆10,529 · Jul 15, 2026 · Updated 2 weeks ago
tatsu-lab / stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
☆30,244 · Jul 17, 2024 · Updated 2 years ago