A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆32 · Sep 19, 2025 · Updated 7 months ago
Alternatives and similar repositories for py-txi
Users that are interested in py-txi are comparing it to the libraries listed below.
- The backend behind the LLM-Perf Leaderboard ☆11 · May 5, 2024 · Updated last year
- 🤗 Tokenizers.js: A pure JS/TS implementation of today's most used tokenizers ☆47 · Mar 18, 2026 · Updated last month
- 🤗 Optimum ONNX: Export your model to ONNX and run inference with ONNX Runtime ☆136 · Updated this week
- ☆13 · Mar 30, 2026 · Updated 2 weeks ago
- Monika is an AI assistant that combines speech-to-text, natural language processing, and text-to-speech capabilities for seamless interac… ☆26 · Mar 31, 2025 · Updated last year
- Implementation of the dilated self attention as described in "LongNet: Scaling Transformers to 1,000,000,000 Tokens" ☆13 · Jul 23, 2023 · Updated 2 years ago
- Are foundation LMs multilingual knowledge bases? (EMNLP 2023) ☆19 · Dec 8, 2023 · Updated 2 years ago
- German Alpaca Dataset (Cleaned + Translated) ☆26 · Apr 6, 2023 · Updated 3 years ago
- A RAG that can scale 🧑🏻‍💻 ☆11 · May 28, 2024 · Updated last year
- 🤗 Collection of examples on how to train, deploy and monitor HuggingFace models in Google Cloud Vertex AI ☆22 · Feb 26, 2024 · Updated 2 years ago
- LLM inference optimization simulator, modeling compute-bound prefill and memory-bound decode phases. ☆14 · Jul 12, 2025 · Updated 9 months ago
- German dataset for DPR model training ☆19 · Jul 21, 2024 · Updated last year
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba ☆38 · Oct 16, 2025 · Updated 6 months ago
- Command Line Interface for Hugging Face Inference Endpoints ☆65 · Apr 10, 2024 · Updated 2 years ago
- ☆48 · Nov 8, 2023 · Updated 2 years ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆284 · Jul 11, 2024 · Updated last year
- 🤗 Trade any tensors over the network ☆31 · Sep 27, 2023 · Updated 2 years ago
- Chunk your text using gpt4o-mini more accurately ☆44 · Aug 3, 2024 · Updated last year
- Public repository for the LLM course at MVA ☆38 · Mar 13, 2026 · Updated last month
- A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆334 · Apr 3, 2026 · Updated 2 weeks ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ☆17 · Mar 20, 2024 · Updated 2 years ago
- A framework for evaluating semantic search across custom datasets, metrics, and embedding backends. ☆38 · May 26, 2025 · Updated 10 months ago
- Sparse Embedding Compression for Scalable Retrieval in Recommender Systems ☆35 · Nov 21, 2025 · Updated 4 months ago
- Semantically Search Emojis From the Command Line! ☆13 · Nov 26, 2023 · Updated 2 years ago
- Confusion Matrix in Python: plot a pretty confusion matrix (like Matlab) in python using seaborn and matplotlib ☆19 · Nov 19, 2021 · Updated 4 years ago
- Efficiently find the best-suited language model (LM) for your NLP task ☆135 · Jul 26, 2025 · Updated 8 months ago
- Generate interleaved text and image content in a structured format you can directly pass to downstream APIs. ☆29 · Oct 18, 2024 · Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆23 · Jun 30, 2025 · Updated 9 months ago
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆18 · Dec 22, 2023 · Updated 2 years ago
- Article about deploying machine learning models using grpc, pytorch and asyncio ☆29 · Nov 18, 2022 · Updated 3 years ago
- A Python module for retrieving script types of writing systems including alphabets, abjads, abugidas, syllabaries, logographs, featurals … ☆15 · Jul 19, 2024 · Updated last year
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆186 · Sep 23, 2024 · Updated last year
- GlotCC Dataset and Pipeline -- NeurIPS 2024 ☆20 · Apr 6, 2025 · Updated last year
- Benchmark structured generation libraries ☆31 · Oct 25, 2024 · Updated last year
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies ☆21 · Oct 24, 2022 · Updated 3 years ago
- Accurate word segmentation for hashtags and text, powered by Transformers and Beam Search. A scalable alternative to heuristic splitters … ☆77 · Jan 8, 2026 · Updated 3 months ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆81 · Aug 30, 2023 · Updated 2 years ago
- Python library for automatic training, optimization and comparison of Transformer models on most NLP tasks. ☆20 · May 6, 2023 · Updated 2 years ago
- Deploy a FastHTML app in just a few lines of simple python code on Modal's serverless infra. ☆26 · Aug 19, 2024 · Updated last year