Triton backend for https://github.com/OpenNMT/CTranslate2
☆35Jul 7, 2023Updated 2 years ago
Alternatives and similar repositories for ctranslate2_triton_backend
Users that are interested in ctranslate2_triton_backend are comparing it to the libraries listed below.
- WIP: Ofen is a toolkit aimed at making transformer models production-ready. API included☆17Oct 2, 2024Updated last year
- Triton backend for https://github.com/OpenNMT/CTranslate2☆11Aug 20, 2024Updated last year
- Integrating SSE with NVIDIA Triton Inference Server using a Python backend and Zephyr model. There is very little documentation on how to use …☆10May 29, 2024Updated last year
- Create TensorRT-runtime for Retinaface☆16Dec 4, 2021Updated 4 years ago
- An offline CPU-first low-resource chat application to perform RAG on your corpus of data. Powered by OpenChat and CTranslate2.☆14May 14, 2025Updated 11 months ago
- MagFace Triton Inference Server using TensorRT☆18Feb 12, 2022Updated 4 years ago
- ☆12Feb 22, 2024Updated 2 years ago
- PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.☆842Aug 13, 2025Updated 8 months ago
- No Language Left Unlocked: scalable backtranslation of NLLB models☆14Aug 4, 2025Updated 8 months ago
- Apache Arrow-compatible space-efficient "tape" class in pure Rust to be used with StringZilla for GPU, NUMA, and disk transfers of variab…☆29Nov 21, 2025Updated 4 months ago
- This project fine-tunes a pretrained torchvision model to classify whether an image is anime or real☆15Aug 11, 2025Updated 8 months ago
- Official code for "Binary embedding based retrieval at Tencent"☆44Mar 7, 2024Updated 2 years ago
- mixedbread ai python sdk☆12Jul 1, 2024Updated last year
- Connecting Transformers on HuggingFace Hub with CTranslate2☆39Aug 27, 2024Updated last year
- Multi-Agent Reinforcement Learning Environment for the card game SkyJo, compatible with PettingZoo and RLLIB☆16Feb 21, 2026Updated last month
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model.☆12Dec 24, 2022Updated 3 years ago
- ☆12Apr 28, 2023Updated 2 years ago
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- Korean sentence analysis system BCD-KL-Parser☆10Jun 23, 2020Updated 5 years ago
- Dense Passage Retrieval using tensorflow-keras on TPU☆17Jun 27, 2021Updated 4 years ago
- ☆14Jun 25, 2024Updated last year
- Keyword extraction using Scake, KeyBERT, Fine-tuning Transformer BERT-like models and ChatGPT.☆12May 22, 2023Updated 2 years ago
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r…☆19Mar 23, 2024Updated 2 years ago
- HTSP client for python☆10Jan 28, 2022Updated 4 years ago
- ☆13May 28, 2013Updated 12 years ago
- This repository is designed for deploying and managing server processes that handle embeddings using the Infinity Embedding model or Larg…☆26Mar 6, 2025Updated last year
- Codebase, data and models for the Headline Grouping paper at NAACL2021☆12Oct 2, 2022Updated 3 years ago
- A Rust crate offering similar functionality to the Python transformers package using Candle.☆14Nov 19, 2024Updated last year
- Reading comprehension based question-answering model for news articles.☆11Jun 22, 2022Updated 3 years ago
- Code for the MTEB Arena☆24Jul 2, 2025Updated 9 months ago
- So You want to build a GenAI Agent on Google Cloud?☆17Mar 7, 2026Updated last month
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…☆160Jul 14, 2025Updated 9 months ago
- End-to-end information extraction pipeline built with LayoutLMv2, using a pretrained model from Hugging Face☆11Aug 15, 2023Updated 2 years ago
- Open source RAG with Llama Index for Japanese LLM in a low-resource setting☆10May 12, 2025Updated 11 months ago
- Building or integrating an LLM wrapper shouldn't take more than 10 minutes.☆13Feb 1, 2025Updated last year
- YOLOv5 application on detection of dogs and cats.☆18Mar 14, 2023Updated 3 years ago
- ☆18Mar 19, 2023Updated 3 years ago
- Golang SDK for Truss☆40Apr 8, 2026Updated last week
- Hugging Face RoBERTa with Flash Attention 2☆24Sep 14, 2025Updated 7 months ago