Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It includes NVIDIA’s TensorRT-LLM as a submodule for GPU-accelerated inference on NVIDIA GPUs.
☆42Sep 26, 2024Updated last year
Alternatives and similar repositories for cortex.tensorrt-llm
Users who are interested in cortex.tensorrt-llm are comparing it to the libraries listed below.
- Efficient Finetuning for OpenAI GPT-OSS☆23Oct 2, 2025Updated 4 months ago
- Inference Llama/Llama2/Llama3 Models in NumPy☆21Nov 22, 2023Updated 2 years ago
- Attempt at cog wrapper for nightmareai/real-esrgan for larger images☆16Sep 28, 2023Updated 2 years ago
- Local AI API Platform☆2,758Jul 4, 2025Updated 7 months ago
- ☆18Updated this week
- LoadHound — Lightweight load testing tool for SQL databases.☆17Aug 8, 2025Updated 6 months ago
- GUI for GHRepoSearcher. It lets you search online repositories on GitHub.☆10May 20, 2022Updated 3 years ago
- ☆18Jan 12, 2026Updated last month
- Sift client libraries and protocol buffers☆17Updated this week
- OpenVINO Tokenizers extension☆49Updated this week
- Qt/Qml application using Google speech-to-text API to make voice commands☆11Jan 19, 2020Updated 6 years ago
- ☆10Aug 18, 2025Updated 6 months ago
- A Sketch plugin that simulates the three more common forms of color blindness☆12Feb 27, 2015Updated 11 years ago
- Python wrapper for the energy system optimization framework IESopt.☆18Updated this week
- The Process Watchdog is a Linux-based utility designed to start, monitor and manage processes specified in a configuration file. It ensur…☆11Dec 27, 2025Updated 2 months ago
- Files for the Kubecon EU 2025 Tutorial - Hacking up a Storm☆14Apr 4, 2025Updated 10 months ago
- ☆12Jan 20, 2024Updated 2 years ago
- ☆15Oct 24, 2023Updated 2 years ago
- ☆15Jun 9, 2023Updated 2 years ago
- Port MaxRAMPercentage to Golang, adjust GC parameters (SetGCPercent/SetMemoryLimit) based on the target memory usage percentage, optimize …☆11Nov 25, 2024Updated last year
- Extract ESCO skills and ISCO occupations from texts such as job descriptions or CVs☆23Jun 12, 2025Updated 8 months ago
- Browse Lance tables from your local machine in a simple web UI. No database to set up. Mount a folder and go.☆21Updated this week
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best …☆10Nov 3, 2023Updated 2 years ago
- ☆20Feb 12, 2026Updated 2 weeks ago
- A JSON-RPC 2.0 server, loosely coupled to Django (1.4 - 1.6+)☆18Feb 12, 2013Updated 13 years ago
- A repository aimed at sharing links to climate-related resources.☆12Feb 18, 2026Updated last week
- A Discord bot that answers questions about Replicate.☆16Jan 5, 2024Updated 2 years ago
- expressively bare language☆16Jul 31, 2025Updated 7 months ago
- ☆17Updated this week
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models☆41Aug 4, 2023Updated 2 years ago
- Asynchronous Python client for the Syncthing REST API☆10Jan 19, 2026Updated last month
- Symbian Dropbox Client☆14Jul 22, 2022Updated 3 years ago
- A C++ library for RFC2822 message parsing and creation☆14Aug 29, 2020Updated 5 years ago
- A Kubernetes operator for managing Prefect servers and work pools☆17Updated this week
- Generic Updater for Windows written in C++, with no external dependency.☆11Feb 12, 2026Updated 2 weeks ago
- Unstructured.io API GUI☆11Aug 6, 2023Updated 2 years ago
- A Rust implementation of the OCI artifact specification for WebAssembly☆11Feb 5, 2026Updated 3 weeks ago
- Knowledge graph reasoning: a reproduction of the paper https://arxiv.org/pdf/2010.04029.pdf☆11Oct 26, 2022Updated 3 years ago
- A data extraction example showing how to get a pdf's content.☆12Feb 24, 2021Updated 5 years ago