apoorvkh / academic-pretraining
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources
☆139 · Updated 2 weeks ago
Alternatives and similar repositories for academic-pretraining
Users who are interested in academic-pretraining are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆128 · Updated 3 weeks ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆116 · Updated 5 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆127 · Updated last year
- ☆131 · Updated 2 months ago
- WIP ☆93 · Updated 9 months ago
- ☆182 · Updated this week
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆190 · Updated last year
- Understand and test language model architectures on synthetic tasks. ☆197 · Updated 2 months ago
- PyTorch library for Active Fine-Tuning ☆77 · Updated 3 months ago
- ☆58 · Updated 2 weeks ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆100 · Updated 2 months ago
- ☆153 · Updated last month
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 · Updated 4 months ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆170 · Updated 5 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆98 · Updated last month
- ☆78 · Updated 11 months ago
- Open source interpretability artefacts for R1. ☆140 · Updated last month
- ☆51 · Updated last year
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆52 · Updated 2 months ago
- An introduction to LLM Sampling ☆78 · Updated 5 months ago
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025. ☆61 · Updated 3 weeks ago
- Collection of autoregressive model implementation ☆85 · Updated last month
- ☆95 · Updated 4 months ago
- ☆79 · Updated 9 months ago
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning" ☆91 · Updated 3 weeks ago
- code for training & evaluating Contextual Document Embedding models ☆191 · Updated 3 weeks ago
- supporting pytorch FSDP for optimizers ☆79 · Updated 5 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆73 · Updated 7 months ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆98 · Updated last month
- ☆80 · Updated last year