🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools
★ 3,402 · May 29, 2026 · Updated last week
Alternatives and similar repositories for optimum
Users that are interested in optimum are comparing it to the libraries listed below.
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… · ★ 9,711 · Updated this week
- 🤗 Evaluate: A library for easily evaluating machine learning models and datasets. · ★ 2,452 · May 26, 2026 · Updated last week
- Large Language Model Text Generation Inference · ★ 10,857 · Mar 21, 2026 · Updated 2 months ago
- Accessible large language models via k-bit quantization for PyTorch. · ★ 8,258 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. · ★ 21,226 · Updated this week
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀 · ★ 1,685 · Oct 23, 2024 · Updated last year
- Transformer related optimization, including BERT, GPT · ★ 6,419 · Mar 27, 2024 · Updated 2 years ago
- Simple, safe way to store and distribute tensors · ★ 3,763 · Updated this week
- 🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools · ★ 21,575 · Updated this week
- Fast and memory-efficient exact attention · ★ 24,037 · Updated this week
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. · ★ 33,775 · Updated this week
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production · ★ 10,788 · May 26, 2026 · Updated last week
- Train transformer language models with reinforcement learning. · ★ 18,547 · Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. · ★ 10,484 · May 21, 2026 · Updated 2 weeks ago
- Efficient few-shot learning with Sentence Transformers · ★ 2,744 · May 26, 2026 · Updated last week
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. · ★ 2,109 · Jun 30, 2025 · Updated 11 months ago
- A PyTorch quantization backend for optimum · ★ 1,042 · Updated this week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools · ★ 593 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. · ★ 42,412 · May 30, 2026 · Updated last week
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat… · ★ 13,793 · Updated this week
- The Triton Inference Server provides an optimized cloud and edge inferencing solution. · ★ 10,733 · Updated this week
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. · ★ 5,065 · Apr 11, 2025 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs · ★ 81,909 · Updated this week
- SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, … · ★ 2,652 · Updated this week
- A blazing fast inference solution for text embeddings models · ★ 4,840 · May 26, 2026 · Updated last week
- ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator · ★ 20,718 · Updated this week
- Development repository for the Triton language and compiler · ★ 19,313 · May 31, 2026 · Updated last week
- PyTorch extensions for high performance and large scale training. · ★ 3,408 · Apr 26, 2025 · Updated last year
- PyTorch native quantization and sparsity for training and inference · ★ 2,841 · May 30, 2026 · Updated last week
- State-of-the-Art Embeddings, Retrieval, and Reranking · ★ 18,758 · May 28, 2026 · Updated last week
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… · ★ 1,586 · Jan 28, 2026 · Updated 4 months ago
- Foundation Architecture for (M)LLMs · ★ 3,131 · Apr 11, 2024 · Updated 2 years ago
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… · ★ 17,292 · Updated this week
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. · ★ 3,077 · May 26, 2026 · Updated last week
- Fast inference engine for Transformer models · ★ 4,509 · Updated this week
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H… · ★ 3,366 · May 29, 2026 · Updated last week
- Minimalistic large language model 3D-parallelism training · ★ 2,705 · May 26, 2026 · Updated last week
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… · ★ 337 · May 26, 2026 · Updated last week
- SGLang is a high-performance serving framework for large language models and multimodal models. · ★ 28,512 · May 31, 2026 · Updated last week