🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools
★3,409 · Jun 3, 2026 · Updated last week
Alternatives and similar repositories for optimum
Users who are interested in optimum are comparing it to the libraries listed below.
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,711 · Jun 2, 2026 · Updated last week
- 🤗 Evaluate: A library for easily evaluating machine learning models and datasets. ★2,452 · May 26, 2026 · Updated 2 weeks ago
- Large Language Model Text Generation Inference ★10,859 · Mar 21, 2026 · Updated 2 months ago
- Accessible large language models via k-bit quantization for PyTorch. ★8,258 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★21,226 · Jun 1, 2026 · Updated last week
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀 ★1,685 · Oct 23, 2024 · Updated last year
- Transformer related optimization, including BERT, GPT ★6,421 · Mar 27, 2024 · Updated 2 years ago
- Simple, safe way to store and distribute tensors ★3,764 · Updated this week
- 🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools ★21,575 · Jun 3, 2026 · Updated last week
- Fast and memory-efficient exact attention ★24,037 · Jun 3, 2026 · Updated last week
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ★33,775 · Updated this week
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production ★10,806 · Updated this week
- Train transformer language models with reinforcement learning. ★18,547 · Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. ★10,484 · May 21, 2026 · Updated 2 weeks ago
- Efficient few-shot learning with Sentence Transformers ★2,743 · May 26, 2026 · Updated 2 weeks ago
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ★2,107 · Jun 30, 2025 · Updated 11 months ago
- A PyTorch quantization backend for optimum ★1,042 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★42,478 · Updated this week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ★593 · Jun 3, 2026 · Updated last week
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat… ★13,793 · Jun 3, 2026 · Updated last week
- The Triton Inference Server provides an optimized cloud and edge inferencing solution. ★10,733 · Updated this week
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ★5,068 · Apr 11, 2025 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ★81,909 · Updated this week
- SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, … ★2,652 · Updated this week
- A blazing fast inference solution for text embeddings models ★4,840 · May 26, 2026 · Updated 2 weeks ago
- ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator ★20,718 · Jun 4, 2026 · Updated last week
- Development repository for the Triton language and compiler ★19,380 · Updated this week
- PyTorch extensions for high performance and large scale training. ★3,407 · Apr 26, 2025 · Updated last year
- PyTorch native quantization and sparsity for training and inference ★2,847 · Updated this week
- State-of-the-Art Embeddings, Retrieval, and Reranking ★18,780 · Updated this week
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ★1,586 · Jan 28, 2026 · Updated 4 months ago
- Foundation Architecture for (M)LLMs ★3,132 · Apr 11, 2024 · Updated 2 years ago
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… ★17,292 · Jun 3, 2026 · Updated last week
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ★3,077 · May 26, 2026 · Updated 2 weeks ago
- Fast inference engine for Transformer models ★4,509 · Jun 3, 2026 · Updated last week
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H… ★3,381 · Updated this week
- Minimalistic large language model 3D-parallelism training ★2,711 · May 26, 2026 · Updated 2 weeks ago
- A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ★337 · May 26, 2026 · Updated 2 weeks ago
- SGLang is a high-performance serving framework for large language models and multimodal models. ★28,886 · Updated this week