Cerebras / gigaGPT
a small code base for training large models
☆322Apr 28, 2025Updated 9 months ago
Alternatives and similar repositories for gigaGPT
Users that are interested in gigaGPT are comparing it to the libraries listed below
- [WIP] Transformer to embed Danbooru labelsets☆13Mar 31, 2024Updated last year
- Minimal Implementation of VCRec (2024) for collapse prevention.☆18Jan 28, 2025Updated last year
- RWKV in nanoGPT style☆196Jun 9, 2024Updated last year
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.☆6,183Aug 22, 2025Updated 5 months ago
- Port of Andrej Karpathy's nanoGPT to Apple MLX framework.☆117Feb 12, 2024Updated 2 years ago
- ☆14Oct 4, 2024Updated last year
- ☆22Jan 27, 2026Updated 2 weeks ago
- Official Implementation of NeurIPS'23 Paper "Cross-Episodic Curriculum for Transformer Agents"☆31Oct 12, 2023Updated 2 years ago
- Minimalistic large language model 3D-parallelism training☆2,544Dec 11, 2025Updated 2 months ago
- Minimalistic, hackable PyTorch implementation of SimSiam in ~400 lines. Achieves good performance on ImageNet with ResNet50. Features dis…☆21Nov 25, 2024Updated last year
- Create synthetic datasets from scratch using AI-powered generation. Define topics, customize prompts, and generate high-quality reasoning…☆29Updated this week
- NanoGPT-speedrunning for the poor T4 enjoyers☆73Apr 22, 2025Updated 9 months ago
- ☆71Jul 11, 2024Updated last year
- An introduction to DSPy☆33Aug 30, 2025Updated 5 months ago
- ☆15Dec 22, 2023Updated 2 years ago
- Fast modular code to create and train cutting edge LLMs☆68May 16, 2024Updated last year
- Jax like function transformation engine but micro, microjax☆34Oct 25, 2024Updated last year
- A collection of reproducible inference engine benchmarks☆38Apr 22, 2025Updated 9 months ago
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you…☆37May 14, 2024Updated last year
- Common tools for data processing☆22Dec 8, 2025Updated 2 months ago
- Tools for merging pretrained large language models.☆6,783Jan 26, 2026Updated 2 weeks ago
- ☆234Nov 24, 2025Updated 2 months ago
- Make your terminal string output even more beautiful 💖☆11Jul 27, 2025Updated 6 months ago
- ☆63Sep 23, 2024Updated last year
- A comprehensive repository of reasoning tasks for LLMs (and beyond)☆458Sep 27, 2024Updated last year
- ☆1,118Jan 1, 2026Updated last month
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"☆248Jun 6, 2025Updated 8 months ago
- NanoGPT (124M) in 2 minutes☆4,624Updated this week
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl…☆2,174Oct 8, 2024Updated last year
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.☆13,155Feb 8, 2026Updated last week
- Code of the Paper "Time-Efficient Reinforcement Learning with Stochastic Stateful Policies"☆25May 5, 2024Updated last year
- Scaling Data-Constrained Language Models☆341Jun 28, 2025Updated 7 months ago
- ☆42Jun 19, 2024Updated last year
- Minimalistic 4D-parallelism distributed training framework for educational purposes☆2,076Aug 26, 2025Updated 5 months ago
- Official implementation of Half-Quadratic Quantization (HQQ)☆913Dec 18, 2025Updated last month
- Luth is a state-of-the-art series of fine-tuned LLMs for French☆41Oct 12, 2025Updated 4 months ago
- QLoRA for Masked Language Modeling☆22Sep 11, 2023Updated 2 years ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆224Dec 16, 2025Updated last month
- Simple repository for training small reasoning models☆49Feb 6, 2025Updated last year