This reference project lets any existing OpenAI-integrated app run TRT-LLM inference locally on a GeForce GPU on Windows instead of in the cloud.
☆128 Feb 29, 2024 Updated 2 years ago
Alternatives and similar repositories for trt-llm-as-openai-windows
Users that are interested in trt-llm-as-openai-windows are comparing it to the libraries listed below.
- A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM ☆3,127 Jan 21, 2026 Updated 5 months ago
- OpenAI compatible API for TensorRT LLM triton backend ☆221 Aug 1, 2024 Updated last year
- ☆13 Feb 18, 2024 Updated 2 years ago
- All-in-one Full-Featured Python/Flet/Flutter Application to make the most of all the latest Open-Source AI Art Generators in an intuitive… ☆16 May 30, 2025 Updated last year
- Text-Guided Generation of Full-Body Image with Preserved Reference Face for Customized Animation ☆24 Jun 24, 2024 Updated 2 years ago
- Flask API for generating text embeddings using OpenAI or sentence_transformers ☆14 Sep 1, 2023 Updated 2 years ago
- ☆30 Aug 21, 2024 Updated last year
- Run Vision LLMs, TTS and STT APIs. Website and API for https://text-generator.io ☆39 May 8, 2026 Updated last month
- An OpenAI API compatible images server to generate or manipulate images. ☆18 Feb 2, 2025 Updated last year
- ☆25 Feb 18, 2024 Updated 2 years ago
- GenAI Examples ☆16 Dec 13, 2024 Updated last year
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆59 Updated this week
- ☆14 Nov 22, 2024 Updated last year
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat… ☆13,994 Updated this week
- experiments with inference on llama ☆103 Jun 6, 2024 Updated 2 years ago
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models. ☆51 Feb 13, 2025 Updated last year
- This is the repo with the code to conduct a comparative analysis of different audio representation models. ☆12 Aug 31, 2023 Updated 2 years ago
- Official Repo for MoCha Towards Movie-Grade Talking Character Synthesis ☆62 Dec 27, 2025 Updated 6 months ago
- Copier template for creating a Mopidy extension ☆17 May 16, 2026 Updated last month
- ☆13 Feb 18, 2023 Updated 3 years ago
- ☆25 Mar 6, 2024 Updated 2 years ago
- 2D python IDE built into blender as an add-on! ☆21 Sep 16, 2024 Updated last year
- ☆24 Mar 10, 2025 Updated last year
- Scriptable interface to a powerful, multi-lingual language server ☆44 Jun 22, 2026 Updated last week
- The NVIDIA RTX™ AI Toolkit is a suite of tools and SDKs for Windows developers to customize, optimize, and deploy AI models across RTX PC… ☆184 Nov 24, 2025 Updated 7 months ago
- Offline version of the NanoPy Editor. No server needed. ☆14 Mar 7, 2025 Updated last year
- ☆76 Mar 10, 2026 Updated 3 months ago
- ☆72 Apr 4, 2025 Updated last year
- A simple node to download repos from HF specify a repo ID or File create a folder where you want to download the files then rename the fo… ☆25 Jul 14, 2025 Updated 11 months ago
- Original implementation of "FLoD: Integrating Flexible Level of Detail into 3D Gaussian Splatting for Customizable Rendering" ☆29 Sep 1, 2025 Updated 9 months ago
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆42 Sep 26, 2024 Updated last year
- ☆63 Nov 8, 2023 Updated 2 years ago
- 🪐 🎛️ User interface to manage your Jupyter platform. ☆17 Apr 30, 2026 Updated last month
- ☆40 May 10, 2024 Updated 2 years ago
- Build beautiful marketing sites with SvelteKit - A lightweight, customizable component library for static marketing websites. ☆19 Mar 28, 2025 Updated last year
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆16 Apr 18, 2024 Updated 2 years ago
- ☆19 Jan 17, 2025 Updated last year
- The repo for: TriHuman: A Real-time and Controllable Tri-plane Representation for Detailed Human Geometry and Appearance Synthesis ☆19 Nov 15, 2025 Updated 7 months ago
- ESG Insights AI simplifies ESG data analysis with advanced AI models, ensuring compliance with GRI standards. It helps asset managers ass… ☆13 Oct 31, 2024 Updated last year