hundredblocks / large-model-parallelism
Functional local implementations of the main model-parallelism approaches
☆95, updated last year
Related projects
Alternatives and complementary repositories for large-model-parallelism
- Comprehensive analysis of differences in performance of QLoRA, LoRA, and full fine-tunes. (☆81, updated last year)
- Textbook on reinforcement learning from human feedback (☆74, updated last week)
- A puzzle to learn about prompting (☆119, updated last year)
- JAX implementation of the Llama 2 model (☆210, updated 9 months ago)
- Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* (☆80, updated 10 months ago)
- Inference code for LLaMA models in JAX (☆112, updated 5 months ago)
- Exploring finetuning public checkpoints on filtered 8K sequences on the Pile (☆115, updated last year)
- A MAD laboratory to improve AI architecture designs 🧪 (☆95, updated 6 months ago)
- Experiments with generating open-source language model assistants (☆97, updated last year)
- The simplest, fastest repository for training/finetuning medium-sized GPTs. (☆83, updated last week)
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… (☆46, updated last week)
- Minimal (400 LOC) implementation, maximum (multi-node, FSDP) GPT training (☆112, updated 6 months ago)
- An interactive exploration of Transformer programming. (☆245, updated 11 months ago)
- git extension for {collaborative, communal, continual} model development (☆205, updated 5 months ago)
- Train very large language models in JAX. (☆195, updated last year)
- Code for training & evaluating Contextual Document Embedding models (☆92, updated this week)
- Experiments for efforts to train a new and improved T5 (☆76, updated 6 months ago)
- Resources from the EleutherAI Math Reading Group (☆50, updated last month)
- Erasing concepts from neural representations with provable guarantees (☆208, updated 3 weeks ago)
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022) (☆101, updated last year)
- A pipeline for using API calls to agnostically convert unstructured data into structured training data (☆28, updated last month)
- Used for adaptive human-in-the-loop evaluation of language and embedding models. (☆303, updated last year)
- Training and inference notebooks for the RedPajama (OpenLLaMA) models (☆18, updated last year)