huggingface / optimum-graphcore
Blazing fast training of 🤗 Transformers on Graphcore IPUs
☆85 · Updated 11 months ago
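For context on the tagline above, optimum-graphcore offers IPU-ready counterparts to the 🤗 Transformers training API: `IPUTrainer` and `IPUTrainingArguments` stand in for `Trainer` and `TrainingArguments`, with an `IPUConfig` describing IPU execution. The sketch below is illustrative only; it assumes a working Poplar SDK/IPU environment, and the checkpoint names, hyperparameters, and toy dataset are assumptions rather than anything taken from this page.

```python
# Minimal sketch (illustrative, not from this page) of fine-tuning a 🤗 Transformers
# model with optimum-graphcore. Assumes an IPU machine with the Poplar SDK installed;
# model/config names and hyperparameters are illustrative assumptions.
from datasets import Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.graphcore import IPUConfig, IPUTrainer, IPUTrainingArguments

# Toy dataset, tokenized so the trainer can consume it directly.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
raw = Dataset.from_dict({"text": ["IPUs are fast", "CPUs are not IPUs"], "label": [1, 0]})
train_dataset = raw.map(
    lambda ex: tokenizer(ex["text"], padding="max_length", truncation=True, max_length=32),
    remove_columns=["text"],
)

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# IPUConfig holds IPU-specific execution settings (e.g. pipelining/replication).
ipu_config = IPUConfig.from_pretrained("Graphcore/bert-base-ipu")

# IPUTrainingArguments/IPUTrainer mirror the standard Transformers
# TrainingArguments/Trainer, extended with an ipu_config argument.
training_args = IPUTrainingArguments(
    output_dir="./outputs",
    per_device_train_batch_size=1,
    num_train_epochs=1,
)
trainer = IPUTrainer(
    model=model,
    ipu_config=ipu_config,
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
```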
Alternatives and similar repositories for optimum-graphcore:
Users interested in optimum-graphcore are comparing it to the libraries listed below.
- Inference code for LLaMA models in JAX ☆116 · Updated 9 months ago
- Implementation of Flash Attention in Jax ☆206 · Updated last year
- JAX implementation of the Llama 2 model ☆216 · Updated last year
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆174 · Updated this week
- Implementation of a Transformer, but completely in Triton ☆259 · Updated 2 years ago
- Implementation of the specific Transformer architecture from PaLM (Scaling Language Modeling with Pathways) in Jax (Equinox framework) ☆187 · Updated 2 years ago
- Google TPU optimizations for transformers models ☆102 · Updated last month
- Train very large language models in Jax. ☆203 · Updated last year
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆154 · Updated 3 months ago
- JAX-Toolbox ☆286 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. ☆164 · Updated this week
- This repository contains the experimental PyTorch native float8 training UX ☆221 · Updated 7 months ago
- jax-triton contains integrations between JAX and OpenAI Triton ☆382 · Updated this week
- Training material for IPU users: tutorials, feature examples, simple applications ☆86 · Updated last year
- GPTQ inference Triton kernel ☆297 · Updated last year
- A tool to configure, launch and manage your machine learning experiments. ☆123 · Updated this week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆226 · Updated this week
- Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes ☆236 · Updated last year
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" ☆362 · Updated last year
- JMP is a Mixed Precision library for JAX. ☆193 · Updated last month
- DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight… ☆235 · Updated last year
- Various transformers for FSDP research ☆37 · Updated 2 years ago