🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial)
17 stars · Mar 20, 2024 · Updated 2 years ago
Alternatives and similar repositories for vertex-ai-huggingface-inference-toolkit
Users interested in vertex-ai-huggingface-inference-toolkit are comparing it to the libraries listed below.
- A Rust crate offering similar functionality to the Python transformers package using Candle. (14 stars · Nov 19, 2024 · Updated last year)
- Trade any tensors over the network (31 stars · Sep 27, 2023 · Updated 2 years ago)
- Fine-tune OpenAI models for text classification, question answering, and more (17 stars · May 1, 2023 · Updated 2 years ago)
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies (21 stars · Oct 24, 2022 · Updated 3 years ago)
- A Python package template using pyproject.toml, hatch, pre-commit, black, ruff, and mkdocs. (59 stars · Sep 7, 2023 · Updated 2 years ago)
- 🤗 Collection of examples on how to train, deploy and monitor HuggingFace models in Google Cloud Vertex AI (22 stars · Feb 26, 2024 · Updated 2 years ago)
- INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ. (24 stars · Sep 24, 2023 · Updated 2 years ago)
- A RAG that can scale (11 stars · May 28, 2024 · Updated last year)
- (no description; 10 stars · Oct 2, 2024 · Updated last year)
- Fast search index for SPLADE sparse retrieval models implemented in Python using NumPy and Numba (38 stars · Oct 16, 2025 · Updated 6 months ago)
- YASEM - Yet Another Splade|Sparse Embedder - A simple and efficient library for SPLADE embeddings (13 stars · May 22, 2025 · Updated 10 months ago)
- (no description; 17 stars · Jan 5, 2023 · Updated 3 years ago)
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. (93 stars · Apr 7, 2026 · Updated last week)
- (no description; 17 stars · Sep 9, 2022 · Updated 3 years ago)
- (no description; 25 stars · Updated this week)
- Sparse Embedding Compression for Scalable Retrieval in Recommender Systems (35 stars · Nov 21, 2025 · Updated 4 months ago)
- Semantically Search Emojis From the Command Line! (13 stars · Nov 26, 2023 · Updated 2 years ago)
- A missing piece of the Python multitask (both threads and processes) API: An extension that supports stateful worker pools & size-aware i… (29 stars · Mar 8, 2026 · Updated last month)
- Starbucks: Improved Training for 2D Matryoshka Embeddings (23 stars · Jun 30, 2025 · Updated 9 months ago)
- A Python module for retrieving script types of writing systems including alphabets, abjads, abugidas, syllabaries, logographs, featurals … (15 stars · Jul 19, 2024 · Updated last year)
- [NeurIPS 2024] GlotCC Dataset and Pipeline (20 stars · Apr 6, 2025 · Updated last year)
- Official PyTorch Implementation for the "What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-mod… (20 stars · Sep 26, 2024 · Updated last year)
- Command Line Interface for Hugging Face Inference Endpoints (65 stars · Apr 10, 2024 · Updated 2 years ago)
- Keyphrase Extraction Prototypes (15 stars · Nov 24, 2016 · Updated 9 years ago)
- SpanMarker for Named Entity Recognition (469 stars · Apr 10, 2026 · Updated last week)
- Minimum Bayes Risk Decoding for Hugging Face Transformers (60 stars · Jun 3, 2024 · Updated last year)
- Building or integrating an LLM wrapper shouldn't take more than 10 minutes. (13 stars · Feb 1, 2025 · Updated last year)
- Library for evaluating RAG using Nuclia's models (18 stars · Jul 31, 2024 · Updated last year)
- Implementation of the dilated self attention as described in "LongNet: Scaling Transformers to 1,000,000,000 Tokens" (13 stars · Jul 23, 2023 · Updated 2 years ago)
- (no description; 33 stars · May 18, 2025 · Updated 11 months ago)
- Smart commit messages (18 stars · Oct 25, 2024 · Updated last year)
- ApertureDB Python Client (12 stars · Apr 10, 2026 · Updated last week)
- GitHub action that'll sync files from a GitHub Repo with the Hugging Face Hub 🤗 (80 stars · Oct 30, 2024 · Updated last year)
- Convert datasets from Hugging Face to FiftyOne for Visualization (11 stars · Mar 15, 2024 · Updated 2 years ago)
- Modular retrievers for zero-shot multilingual IR. (30 stars · Mar 6, 2024 · Updated 2 years ago)
- Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-Ranking (25 stars · Apr 4, 2025 · Updated last year)
- Model implementation for the contextual embeddings project (47 stars · Jun 2, 2025 · Updated 10 months ago)
- GitHub repository linked to AnimeBackgroundGAN HuggingFace Space (10 stars · May 24, 2022 · Updated 3 years ago)
- A C++ library for working with OWL2 ontologies. (12 stars · Jan 26, 2016 · Updated 10 years ago)