huggingface / optimum-graphcore
Blazing fast training of 🤗 Transformers on Graphcore IPUs
★87 · Updated last year
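For context, optimum-graphcore follows the drop-in Trainer pattern of 🤗 Transformers. Below is a minimal sketch of that pattern, assuming the library's `IPUTrainer`, `IPUTrainingArguments`, and `IPUConfig` classes; the model name, IPU config checkpoint, and toy dataset are illustrative placeholders, and actually running it requires Graphcore IPU hardware with the Poplar SDK installed.

```python
# A minimal sketch, assuming optimum-graphcore's drop-in Trainer pattern:
# IPUTrainer / IPUTrainingArguments stand in for transformers' Trainer /
# TrainingArguments, plus an IPUConfig describing IPU execution options.
# Model/config names and the toy dataset are illustrative placeholders.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.graphcore import IPUConfig, IPUTrainer, IPUTrainingArguments

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# IPU-specific execution options (replication, pipelining, etc.)
ipu_config = IPUConfig.from_pretrained("Graphcore/bert-base-ipu")

# Toy dataset: pre-padded to a fixed length so batching needs no collator logic.
train_dataset = [
    {**tokenizer(text, padding="max_length", max_length=32, truncation=True),
     "labels": label}
    for text, label in [("a positive example", 1), ("a negative example", 0)]
]

args = IPUTrainingArguments(
    output_dir="./outputs",
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = IPUTrainer(
    model=model,
    ipu_config=ipu_config,  # the extra argument vs. the stock Trainer
    args=args,
    train_dataset=train_dataset,
)
trainer.train()
```

The design intent is that existing Trainer-based scripts port over by swapping the class names and supplying an `IPUConfig`, rather than by rewriting the training loop.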
Alternatives and similar repositories for optimum-graphcore
Users interested in optimum-graphcore are comparing it to the libraries listed below.
- ★66 · Updated 3 years ago
- DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. ★171 · Updated 4 months ago
- Implementation of the specific Transformer architecture from PaLM (Scaling Language Modeling with Pathways) in JAX (Equinox framework) ★190 · Updated 3 years ago
- Implementation of Flash Attention in JAX ★224 · Updated last year
- Large-scale 4D-parallel pre-training for 🤗 Transformers with Mixture of Experts *(still a work in progress)* ★86 · Updated 2 years ago
- JAX implementation of the Llama 2 model ★216 · Updated last year
- Easy and lightning-fast training of 🤗 Transformers on the Habana Gaudi processor (HPU) ★205 · Updated last week
- ★192 · Updated last week
- ★63 · Updated 3 years ago
- ★366 · Updated last year
- Implementation of a Transformer, but completely in Triton ★278 · Updated 3 years ago
- Inference code for LLaMA models in JAX ★120 · Updated last year
- [WIP] A 🔥 interface for running code in the cloud ★86 · Updated 2 years ago
- Minimal library to train LLMs on TPU in JAX with pjit(). ★301 · Updated 2 years ago
- Some common Hugging Face Transformers in maximal update parametrization (µP) ★87 · Updated 3 years ago
- Pipeline for pulling and processing online language model pretraining data from the web ★177 · Updated 2 years ago
- ★20 · Updated 3 years ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day ★260 · Updated 2 years ago
- Train very large language models in JAX. ★210 · Updated 2 years ago
- Google TPU optimizations for transformers models ★133 · Updated this week
- ★131 · Updated 3 years ago
- ★252 · Updated last year
- Used for adaptive human-in-the-loop evaluation of language and embedding models. ★308 · Updated 2 years ago
- OSLO: Open Source for Large-scale Optimization ★175 · Updated 2 years ago
- Techniques used to run BLOOM inference in parallel ★37 · Updated 3 years ago
- A git extension for {collaborative, communal, continual} model development ★217 · Updated last year
- Language Modeling with the H3 State Space Model ★522 · Updated 2 years ago
- Swarm training framework using Haiku + JAX + Ray for layer-parallel transformer language models on unreliable, heterogeneous nodes ★242 · Updated 2 years ago
- ★151 · Updated 3 weeks ago
- Training material for IPU users: tutorials, feature examples, simple applications ★87 · Updated 2 years ago