🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
☆ 3,348 · Apr 2, 2026 · Updated last week
Alternatives and similar repositories for optimum
Users that are interested in optimum are comparing it to the libraries listed below.
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… · ☆ 9,596 · Apr 2, 2026 · Updated last week
- 🤗 Evaluate: A library for easily evaluating machine learning models and datasets. · ☆ 2,437 · Apr 2, 2026 · Updated last week
- Large Language Model Text Generation Inference · ☆ 10,817 · Mar 21, 2026 · Updated 3 weeks ago
- Accessible large language models via k-bit quantization for PyTorch. · ☆ 8,107 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. · ☆ 20,895 · Apr 2, 2026 · Updated last week
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀 · ☆ 1,688 · Oct 23, 2024 · Updated last year
- Transformer related optimization, including BERT, GPT · ☆ 6,410 · Mar 27, 2024 · Updated 2 years ago
- Simple, safe way to store and distribute tensors · ☆ 3,678 · Apr 2, 2026 · Updated last week
- Fast and memory-efficient exact attention · ☆ 23,185 · Updated this week
- 🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools · ☆ 21,374 · Updated this week
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. · ☆ 33,282 · Updated this week
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production · ☆ 10,597 · Apr 2, 2026 · Updated last week
- Train transformer language models with reinforcement learning. · ☆ 17,967 · Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. · ☆ 10,411 · Mar 30, 2026 · Updated last week
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. · ☆ 2,107 · Jun 30, 2025 · Updated 9 months ago
- Efficient few-shot learning with Sentence Transformers · ☆ 2,710 · Apr 2, 2026 · Updated last week
- A pytorch quantization backend for optimum · ☆ 1,035 · Apr 2, 2026 · Updated last week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools · ☆ 561 · Apr 2, 2026 · Updated last week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. · ☆ 41,977 · Apr 3, 2026 · Updated last week
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat… · ☆ 13,304 · Updated this week
- The Triton Inference Server provides an optimized cloud and edge inferencing solution. · ☆ 10,533 · Updated this week
- An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm. · ☆ 5,042 · Apr 11, 2025 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs · ☆ 75,637 · Updated this week
- A blazing fast inference solution for text embeddings models · ☆ 4,663 · Updated this week
- SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, … · ☆ 2,612 · Updated this week
- ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator · ☆ 19,779 · Updated this week
- Development repository for the Triton language and compiler · ☆ 18,840 · Apr 4, 2026 · Updated last week
- PyTorch extensions for high performance and large scale training. · ☆ 3,404 · Apr 26, 2025 · Updated 11 months ago
- PyTorch native quantization and sparsity for training and inference · ☆ 2,756 · Apr 4, 2026 · Updated last week
- State-of-the-Art Text Embeddings · ☆ 18,494 · Apr 2, 2026 · Updated last week
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… · ☆ 1,585 · Jan 28, 2026 · Updated 2 months ago
- Foundation Architecture for (M)LLMs · ☆ 3,133 · Apr 11, 2024 · Updated 2 years ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. · ☆ 2,978 · Apr 2, 2026 · Updated last week
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… · ☆ 17,048 · Updated this week
- Fast inference engine for Transformer models · ☆ 4,417 · Feb 4, 2026 · Updated 2 months ago
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H… · ☆ 3,256 · Apr 3, 2026 · Updated last week
- 🏎️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… · ☆ 334 · Apr 3, 2026 · Updated last week
- Minimalistic large language model 3D-parallelism training · ☆ 2,644 · Updated this week
- SGLang is a high-performance serving framework for large language models and multimodal models. · ☆ 25,408 · Apr 4, 2026 · Updated last week