huggingface / optimum-graphcore
Blazing fast training of 🤗 Transformers on Graphcore IPUs
⭐ 81 · Updated 6 months ago
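As a quick orientation for the headline project, here is a minimal fine-tuning sketch in the drop-in-Trainer style optimum-graphcore advertises. It assumes `IPUConfig`, `IPUTrainer`, and `IPUTrainingArguments` from `optimum.graphcore`, plus a `Graphcore/bert-base-ipu` IPU config on the Hugging Face Hub; treat the exact names and arguments as assumptions rather than the project's canonical example.

```python
# Hypothetical minimal sketch: fine-tuning a 🤗 Transformers model on Graphcore IPUs.
# Assumes optimum.graphcore exposes IPUConfig / IPUTrainer / IPUTrainingArguments and
# that the "Graphcore/bert-base-ipu" IPU config exists on the Hub.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.graphcore import IPUConfig, IPUTrainer, IPUTrainingArguments

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize a small text-classification dataset.
dataset = load_dataset("glue", "sst2")
dataset = dataset.map(
    lambda ex: tokenizer(ex["sentence"], truncation=True, max_length=128),
    batched=True,
)

# IPU-specific execution options (pipelining, replication, ...) live in the IPUConfig;
# IPUTrainingArguments mirrors the usual transformers TrainingArguments.
ipu_config = IPUConfig.from_pretrained("Graphcore/bert-base-ipu")
training_args = IPUTrainingArguments(
    output_dir="./outputs",
    per_device_train_batch_size=8,
    num_train_epochs=1,
)

# IPUTrainer is used in place of transformers.Trainer.
trainer = IPUTrainer(
    model=model,
    ipu_config=ipu_config,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
)
trainer.train()
```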
Related projects:
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) (⭐ 144, updated this week)
- Implementation of Flash Attention in Jax (⭐ 188, updated 6 months ago)
- Inference code for LLaMA models in JAX (⭐ 108, updated 4 months ago)
- Implementation of a Transformer, but completely in Triton (⭐ 242, updated 2 years ago)
- JAX-Toolbox (⭐ 231, updated this week)
- Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… (⭐ 150, updated last week)
- JAX implementation of the Llama 2 model (⭐ 205, updated 7 months ago)
- Training material for IPU users: tutorials, feature examples, simple applications (⭐ 86, updated last year)
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… (⭐ 145, updated this week)
- This repository contains the experimental PyTorch native float8 training UX (⭐ 210, updated last month)
- DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. (⭐ 163, updated 4 months ago)
- jax-triton contains integrations between JAX and OpenAI Triton (⭐ 328, updated this week)
- Google TPU optimizations for transformers models (⭐ 62, updated this week)
- Train very large language models in Jax. (⭐ 191, updated 10 months ago)
- Amos optimizer with JEstimator lib. (⭐ 79, updated 4 months ago)
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) (⭐ 184, updated 2 years ago)
- Some common Huggingface transformers in maximal update parametrization (µP) (⭐ 76, updated 2 years ago)
- Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta… (⭐ 446, updated this week)
- Various transformers for FSDP research (⭐ 31, updated last year)
- Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes (⭐ 236, updated last year)
- Babysit your preemptible TPUs (⭐ 84, updated last year)