mistralai / TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
☆15Updated last year
Alternatives and similar repositories for TensorRT-LLM
Users who are interested in TensorRT-LLM are comparing it to the libraries listed below.
- ☆14Updated 2 years ago
- Using GPT-3 and Carrot (GPT-3 for computer vision) to create detailed descriptions of images.☆14Updated 3 years ago
- Simple CogVLM client script☆14Updated last year
- ☆12Updated 6 months ago
- RAG-QA is a free, containerised question-answer framework that allows you to ask questions to your documents in an intuitive way☆19Updated last year
- Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices.☆46Updated 2 years ago
- Python text-to-speech library with built-in voice effects and support for multiple TTS engines☆25Updated 8 months ago
- The NVIDIA RTX™ AI Toolkit is a suite of tools and SDKs for Windows developers to customize, optimize, and deploy AI models across RTX PC…☆180Updated 2 weeks ago
- GGUF Quantization of any LLM.☆41Updated last year
- Visual similarity search engine demo with use of PyTorch Metric Learning and Qdrant☆12Updated 2 years ago
- Your AI companion. ChatGPT and translation for Monocle AR☆22Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆11Updated 2 years ago
- Chat Complex PDF with Tables Using IBM WatsonX, Langchain and LlamaParser.☆14Updated 2 months ago
- Code for blog posts from OpenCV.AI☆15Updated 2 years ago
- Intuitive graphical representation of source code☆13Updated 2 years ago
- ☆12Updated last year
- Demo combining Whisper for speech recognition and Google TTS for speech synthesis to interact with Alpaca-LoRA.☆20Updated last year
- Deploy DL/ML inference pipelines with minimal extra code.☆102Updated last year
- Passively collect images for computer vision datasets on the edge.☆35Updated 2 years ago
- Web-based tool to convert model into MyriadX blob☆16Updated 6 months ago
- A ⚡️ Lightning.ai ⚡️ app demo for Voice based web search using OpenAI's Whisper and DuckDuckGo☆27Updated 3 years ago
- Interactive Textbook Demo☆50Updated last month
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode…☆12Updated last year
- Example Code to Supplement the Label Studio Blog☆30Updated last month
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆23Updated last year
- Convert an audio file to a waveform video☆11Updated 2 years ago
- NVIDIA Fleet Command is a hybrid-cloud platform for securely and remotely deploying, managing, and scaling AI across dozens or up to thou…☆14Updated 3 years ago
- An offline CPU-first low-resource chat application to perform RAG on your corpus of data. Powered by OpenChat and CTranslate2.☆14Updated 6 months ago
- A bot that summarizes AI papers and posts them on twitter☆34Updated last week
- ☆14Updated 2 years ago