huggingface / optimum-graphcore
Blazing fast training of 🤗 Transformers on Graphcore IPUs
⭐85 · Updated last year
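For context, optimum-graphcore mirrors the 🤗 Transformers `Trainer` workflow with IPU-specific classes. Below is a minimal fine-tuning sketch assuming the documented `IPUConfig` / `IPUTrainer` / `IPUTrainingArguments` interface; the checkpoint, IPU config, and dataset names are illustrative placeholders, not part of this listing.

```python
# Minimal sketch of fine-tuning with optimum-graphcore on an IPU system.
# Assumes the documented IPUConfig / IPUTrainer / IPUTrainingArguments API;
# the model, IPU config, and dataset names are illustrative.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.graphcore import IPUConfig, IPUTrainer, IPUTrainingArguments

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# IPU execution parameters (pipelining, replication, etc.) live in a separate
# config; Graphcore publishes ready-made ones on the Hugging Face Hub.
ipu_config = IPUConfig.from_pretrained("Graphcore/bert-base-ipu")

dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=128)

tokenized = dataset.map(tokenize, batched=True)

training_args = IPUTrainingArguments(
    output_dir="./outputs",
    per_device_train_batch_size=8,
    num_train_epochs=1,
)

# IPUTrainer is a drop-in replacement for transformers.Trainer that compiles
# and runs the model on IPUs according to the given ipu_config.
trainer = IPUTrainer(
    model=model,
    ipu_config=ipu_config,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
)
trainer.train()
```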
Alternatives and similar repositories for optimum-graphcore
Users interested in optimum-graphcore are comparing it to the libraries listed below.
- ⭐66 · Updated 3 years ago
- DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. ⭐171 · Updated 2 months ago
- Implementation of Flash Attention in Jax ⭐223 · Updated last year
- JAX implementation of the Llama 2 model ⭐216 · Updated last year
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ⭐189 · Updated 3 years ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ⭐201 · Updated last week
- Inference code for LLaMA models in JAX ⭐120 · Updated last year
- ⭐190 · Updated 3 weeks ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ⭐87 · Updated last year
- ⭐62 · Updated 3 years ago
- [WIP] A 🔥 interface for running code in the cloud ⭐86 · Updated 2 years ago
- Train very large language models in Jax. ⭐210 · Updated 2 years ago
- ⭐19 · Updated 3 years ago
- Training material for IPU users: tutorials, feature examples, simple applications ⭐87 · Updated 2 years ago