zhaohb/fastapi_tritonserver

zhaohb / fastapi_tritonserver

☆28

Alternatives and similar repositories for fastapi_tritonserver

Users interested in fastapi_tritonserver are comparing it to the repositories listed below.

MACNICA-CLAVIS-NV / yolov5-triton
YOLO v5 Object Detection on Triton Inference Server
☆17 · Mar 30, 2023 · Updated 3 years ago
triton-inference-server / openvino_backend
OpenVINO backend for Triton.
☆38 · Updated this week
npuichigo / openai_trtllm
OpenAI compatible API for TensorRT LLM triton backend
☆221 · Aug 1, 2024 · Updated last year
hkmujj / stop_words
Sensitive-word filter list for public security network registration (公安网备)
☆14 · Oct 7, 2018 · Updated 7 years ago
SkyworkAI / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆17 · Jun 3, 2024 · Updated 2 years ago
zhg-SZPT / FastSAM_Awsome_Openvino
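The "FastSAM_Awsome_Openvino" project shows how to deploy the FastSAM model efficiently with the OpenVINO framework, achieving impressive instance segmentation. It provides both a C++ and a Python implementation, giving developers the option of using the FastSAM model in different language environments…
☆37 · Dec 13, 2023 · Updated 2 years ago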
mlwithme / doccano
Open source text annotation tool for machine learning practitioner.
☆12 · Dec 30, 2020 · Updated 5 years ago
isLinXu / regex-tokenizer
Converted the Jina Tokenizer regex pattern to Python.
☆25 · Jun 10, 2026 · Updated last month
Issues-translate-bot / issues-translate-action
The action for translating non-English issues content to English.
☆12 · Dec 16, 2020 · Updated 5 years ago
triton-inference-server / tensorrt_backend
The Triton backend for TensorRT.
☆90 · Updated this week
lrw04 / tinyllamas-ncnn
Run inference for TinyLlama models on ncnn
☆24 · Aug 15, 2023 · Updated 2 years ago
blib-la / ask-poddy
Ask Poddy: Run Open Source LLMs and Embeddings as OpenAI-Compatible Serverless Endpoints (Tutorial)
☆11 · Jul 19, 2024 · Updated 2 years ago
dgg32 / age_vector
☆14 · Sep 18, 2024 · Updated last year
Kazuhito00 / MobileSAM-ONNX-Sample
Sample that converts the MobileSAM encoder/decoder to ONNX and runs inference
☆12 · Apr 11, 2024 · Updated 2 years ago
TroyDoesAI / AI_Research
My Gen AI research
☆11 · Jun 3, 2024 · Updated 2 years ago
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM
☆43 · Oct 20, 2023 · Updated 2 years ago
joszz / HyperVAdmin
A simple website to manage your Hyper-V VMs and IIS sites
☆12 · Jan 19, 2023 · Updated 3 years ago
Franc-Z / QWen1.5_TensorRT-LLM
Optimize QWen1.5 models with TensorRT-LLM
☆17 · May 14, 2024 · Updated 2 years ago
Tlntin / ChatGLM2-6B-TensorRT
☆90 · Jun 30, 2023 · Updated 3 years ago
ParisNeo / vllm_proxy_server
A vLLM proxy server that adds security and multi-model management for vLLM servers
☆12 · May 30, 2024 · Updated 2 years ago
abvijaykumar / python-lora-finetuning
Finetuning a codegen model with a Python instruction set using the QLoRA technique for better efficacy
☆11 · Aug 31, 2023 · Updated 2 years ago
27182812 / ChineseBERT_paddle
Paddle reproduction of the paper ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information (ACL 2021)
☆10 · Nov 15, 2021 · Updated 4 years ago
BiboyQG / dotfiles
☆18 · Updated this week
mawenxing / xiaofeiji
"Little airplane" (小飞机) tutorial for bypassing the firewall
☆24 · Nov 14, 2019 · Updated 6 years ago
daixiangzi / Caffe-PCN
Detect multi-angle faces with PCN, and crop the detected faces
☆28 · Jun 12, 2019 · Updated 7 years ago
ShuaiGuo16 / language_learning_app
A dual-chatbot system for learning languages based on LangChain
☆13 · Jun 25, 2023 · Updated 3 years ago
aahouzi / llama2-chatbot-cpu
A LLaMA2-7b chatbot with memory running on CPU, and optimized using smooth quantization, 4-bit quantization or Intel® Extension For PyTor…
☆15 · Feb 27, 2024 · Updated 2 years ago
apachecn / pyminer-dev-guide
PyMiner developer guide
☆12 · Mar 19, 2022 · Updated 4 years ago
hma02 / cublasHgemm-P100
Code for testing the native float16 matrix multiplication performance on Tesla P100 and V100 GPUs based on cublasHgemm
☆35 · Aug 20, 2019 · Updated 6 years ago
elevenlee / dcn_streaming_video
In this programming assignment you will implement a streaming video server and client that communicate control commands via the Real-Time…
☆11 · Dec 29, 2012 · Updated 13 years ago
alexsnow348 / learning-notes
All the learning notes related to my interests, such as ML, AI, blockchain, and the German (Deutsch) and Burmese languages
☆10 · Mar 2, 2026 · Updated 4 months ago
pogevip / BERT4TensorRT
TensorRT
☆11 · Sep 22, 2020 · Updated 5 years ago
althayr / Document-Layout-Parser
Parses a document (scanned or phone captured) and returns the underlying question - answer layout structured capture by LayoutXLM model
☆10 · Jun 14, 2021 · Updated 5 years ago
GalinaRejoice / learning-cuda-trt
☆12 · Jan 25, 2023 · Updated 3 years ago
aws-samples / genai-at-edge
☆14 · Jun 11, 2024 · Updated 2 years ago
LeeJuly30 / BERTCpp
Implement BERT in pure C++
☆37 · Apr 29, 2020 · Updated 6 years ago
Bobo-y / triton_ensemble_model_demo
Triton server ensemble model demo
☆30 · May 2, 2022 · Updated 4 years ago
will-wiki / softmasked-bert
Reproduction of Soft-Masked BERT for Chinese text correction
☆21 · May 20, 2021 · Updated 5 years ago
amd-enterprise-ai / aim-build
☆15 · Jul 14, 2026 · Updated last week