hundredblocks / large-model-parallelism
Functional local implementations of main model parallelism approaches
☆95 Updated last year
Related projects
Alternatives and complementary repositories for large-model-parallelism
- A puzzle to learn about prompting ☆121 Updated last year
- Train very large language models in Jax. ☆195 Updated last year
- ☆161 Updated last year
- ☆57 Updated 11 months ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆80 Updated 11 months ago
- Proof-of-concept of global switching between numpy/jax/pytorch in a library. ☆18 Updated 5 months ago
- git extension for {collaborative, communal, continual} model development ☆205 Updated last week
- An interactive exploration of Transformer programming. ☆246 Updated last year
- JAX implementation of the Llama 2 model ☆210 Updated 9 months ago
- Supercharge huggingface transformers with model parallelism. ☆75 Updated last month
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆81 Updated last year
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆113 Updated 7 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 Updated 6 months ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆44 Updated 2 weeks ago
- Resources from the EleutherAI Math Reading Group ☆51 Updated last month
- some common Huggingface transformers in maximal update parametrization (µP) ☆76 Updated 2 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆84 Updated this week
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- ☆53 Updated 10 months ago
- ML/DL Math and Method notes ☆57 Updated 11 months ago
- Experiments with generating opensource language model assistants ☆97 Updated last year
- Inference code for LLaMA models in JAX ☆113 Updated 6 months ago
- Exploring finetuning public checkpoints on filtered 8K sequences on Pile ☆115 Updated last year
- ☆64 Updated 2 years ago
- ☆49 Updated 8 months ago
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022) ☆101 Updated last year
- ☆73 Updated 4 months ago
- ☆20 Updated last year
- ☆22 Updated last year