Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models.
☆90Mar 13, 2026Updated last week
Alternatives and similar repositories for huggingface-inference-toolkit
Users that are interested in huggingface-inference-toolkit are comparing it to the libraries listed below.
- ☆21Jan 21, 2026Updated 2 months ago
- ☆24Feb 24, 2026Updated last month
- 🚂 Fine-tune OpenAI models for text classification, question answering, and more☆17May 1, 2023Updated 2 years ago
- AI Energy Score: Initiative to establish comparable energy efficiency ratings for AI models.☆38Dec 2, 2025Updated 3 months ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial)☆17Mar 20, 2024Updated 2 years ago
- Starbucks: Improved Training for 2D Matryoshka Embeddings☆22Jun 30, 2025Updated 8 months ago
- 🤝 Trade any tensors over the network☆31Sep 27, 2023Updated 2 years ago
- Github action to connect to tailscale☆19Mar 10, 2026Updated 2 weeks ago
- 🤗 Collection of examples on how to train, deploy and monitor HuggingFace models in Google Cloud Vertex AI☆22Feb 26, 2024Updated 2 years ago
- A framework for few-shot evaluation of language models.☆36Mar 18, 2025Updated last year
- Chunk Dedupe Estimation☆20Nov 5, 2024Updated last year
- Custom fastapi server packaged as docker image for Huggingface inference endpoints deployment☆13Apr 17, 2024Updated last year
- ☆26Nov 18, 2025Updated 4 months ago
- 8-bit floating point types for Rust☆63Feb 4, 2026Updated last month
- Manage scalable open LLM inference endpoints in Slurm clusters☆284Jul 11, 2024Updated last year
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆23Jul 30, 2024Updated last year
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…☆160Jul 14, 2025Updated 8 months ago
- Python library to use Pleias-RAG models☆68May 1, 2025Updated 10 months ago
- ☆16Jul 23, 2024Updated last year
- Dataset Viber is your chill repo for data collection, annotation and vibe checks.☆47Sep 5, 2024Updated last year
- ☆125Oct 28, 2024Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆32Sep 19, 2025Updated 6 months ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp…☆209Aug 31, 2024Updated last year
- [NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding☆22Oct 10, 2024Updated last year
- Command Line Interface for Hugging Face Inference Endpoints☆65Apr 10, 2024Updated last year
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends☆2,353Mar 9, 2026Updated 2 weeks ago
- Train LLM on Hugging Face infra☆70Nov 13, 2025Updated 4 months ago
- Let's build better datasets, together!☆271Dec 20, 2024Updated last year
- Speech to Speech conversation using the OpenAI RealTime API in Python 🐍☆26Nov 18, 2024Updated last year
- 🌏 Modular retrievers for zero-shot multilingual IR.☆30Mar 6, 2024Updated 2 years ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)☆207Mar 16, 2026Updated last week
- ☆26Dec 13, 2024Updated last year
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode…☆92Oct 10, 2024Updated last year
- ☆16Sep 4, 2025Updated 6 months ago
- Implementation of the dilated self attention as described in "LongNet: Scaling Transformers to 1,000,000,000 Tokens"☆13Jul 23, 2023Updated 2 years ago
- A course on building Large Language Models☆12Mar 24, 2025Updated last year
- Small python package to measure OCR quality and other related metrics.☆27Feb 19, 2024Updated 2 years ago
- Rust client for the huggingface hub aiming for minimal subset of features over `huggingface-hub` python package☆273Updated this week
- A RAG that can scale 🧑🏻💻☆11May 28, 2024Updated last year