hundredblocks / large-model-parallelism
Functional local implementations of the main model-parallelism approaches
☆96 · Updated 2 years ago
Alternatives and similar repositories for large-model-parallelism
Users interested in large-model-parallelism are comparing it to the libraries listed below.
- A puzzle to learn about prompting ☆135 · Updated 2 years ago
- Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still a work in progress)* ☆87 · Updated last year
- Train very large language models in Jax. ☆208 · Updated last year
- An interactive exploration of Transformer programming. ☆269 · Updated last year
- ☆94 · Updated last year
- git extension for {collaborative, communal, continual} model development ☆216 · Updated 10 months ago
- JAX implementation of the Llama 2 model ☆219 · Updated last year
- gzip Predicts Data-dependent Scaling Laws ☆35 · Updated last year
- A Jax-based library for building transformers; includes implementations of GPT, Gemma, LLaMA, Mixtral, Whisper, Swin, ViT and more. ☆291 · Updated last year
- ☆62 · Updated last year
- HomebrewNLP in JAX flavour for maintainable TPU training ☆50 · Updated last year
- A case study of efficient training of large language models using commodity hardware. ☆68 · Updated 3 years ago
- ☆166 · Updated 2 years ago
- some common Huggingface transformers in maximal update parametrization (µP) ☆82 · Updated 3 years ago
- Automatically take good care of your preemptible TPUs ☆36 · Updated 2 years ago
- ☆69 · Updated last year
- Minimal (400 LOC) implementation of maximum (multi-node, FSDP) GPT training ☆132 · Updated last year
- Inference code for LLaMA models in JAX ☆119 · Updated last year
- ML/DL Math and Method notes ☆63 · Updated last year
- Automatic gradient descent ☆210 · Updated 2 years ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆60 · Updated 3 weeks ago
- ☆21 · Updated last year
- Amos optimizer with JEstimator lib. ☆82 · Updated last year
- Proof-of-concept of global switching between numpy/jax/pytorch in a library. ☆18 · Updated last year
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆188 · Updated 3 years ago
- Resources from the EleutherAI Math Reading Group ☆54 · Updated 6 months ago
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022) ☆105 · Updated 2 years ago
- ☆53 · Updated last year
- Used for adaptive human-in-the-loop evaluation of language and embedding models. ☆311 · Updated 2 years ago
- ☆144 · Updated 2 years ago