Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It includes NVIDIA's TensorRT-LLM as a Git submodule for GPU-accelerated inference on NVIDIA GPUs.
☆42 · Sep 26, 2024 · Updated last year
Alternatives and similar repositories for cortex.tensorrt-llm
Users that are interested in cortex.tensorrt-llm are comparing it to the libraries listed below.
- cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server a… ☆44 · Jul 4, 2025 · Updated 11 months ago
- ☆21 · Mar 25, 2025 · Updated last year
- Efficient Finetuning for OpenAI GPT-OSS ☆24 · Oct 2, 2025 · Updated 8 months ago
- A small utility library for parsing GGUF file info ☆29 · Jan 27, 2025 · Updated last year
- Attempt at cog wrapper for nightmareai/real-esrgan for larger images ☆16 · Sep 28, 2023 · Updated 2 years ago
- Local AI API Platform ☆2,758 · Jul 4, 2025 · Updated 11 months ago
- ☆12 · Nov 8, 2023 · Updated 2 years ago
- A deep learning model for detecting tables in photos with ncnn ☆20 · Mar 2, 2025 · Updated last year
- A template for running Stable Diffusion 3 with Cog ☆14 · Aug 20, 2024 · Updated last year
- Moved to here: https://github.com/lyogavin/airllm ☆32 · Aug 1, 2024 · Updated last year
- Attempt at cog wrapper for a SDXL CLIP Interrogator ☆10 · May 16, 2024 · Updated 2 years ago
- Shell scripts for automated transcription on macOS: Integrates whisper.cpp with QuickTime Player and BlackHole-2ch for streamlined audio … ☆24 · Feb 12, 2025 · Updated last year
- Pay attention to what you're paying attention to. ☆29 · May 17, 2022 · Updated 4 years ago
- Attempt at cog wrapper for SDXL Controlnet - Canny ☆13 · Nov 25, 2024 · Updated last year
- SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution ☆14 · Jan 12, 2024 · Updated 2 years ago
- The largest open-source Arabic word list ☆15 · Oct 18, 2021 · Updated 4 years ago
- ☆21 · Feb 20, 2023 · Updated 3 years ago
- fast-embeddings-api ☆16 · Nov 23, 2023 · Updated 2 years ago
- ☆13 · Jul 23, 2024 · Updated last year
- ☆29 · May 27, 2026 · Updated 2 weeks ago
- Cog wrapper for canopylabs/orpheus-3b-0.1-ft ☆22 · Mar 20, 2025 · Updated last year
- A cog implementation of Nvidia's Triton server ☆18 · Oct 23, 2024 · Updated last year
- A Sketch plugin that simulates the three more common forms of color blindness ☆12 · Feb 27, 2015 · Updated 11 years ago
- ☆21 · Jan 15, 2026 · Updated 5 months ago
- Taming Stable Diffusion for Lip Sync! ☆17 · Mar 18, 2025 · Updated last year
- Cog wrapper for PASD Magnify ☆17 · Jan 8, 2024 · Updated 2 years ago
- ☆21 · Mar 3, 2025 · Updated last year
- Cog wrapper for collabora/WhisperSpeech ☆25 · Mar 5, 2024 · Updated 2 years ago
- ☆18 · Mar 19, 2023 · Updated 3 years ago
- ☆14 · Apr 23, 2024 · Updated 2 years ago
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models ☆41 · Aug 4, 2023 · Updated 2 years ago
- ☆14 · May 25, 2023 · Updated 3 years ago
- llama INT4 cuda inference with AWQ ☆54 · Jan 20, 2025 · Updated last year
- ☆15 · Jun 9, 2023 · Updated 3 years ago
- Sampling techniques for Candle. ☆21 · Apr 3, 2024 · Updated 2 years ago
- ☆10 · Aug 18, 2025 · Updated 9 months ago
- An interactive one-pager browser game, primarily built with React.js. It makes use of the state-of-the-art Jurassic-2 language models to … ☆16 · Sep 13, 2023 · Updated 2 years ago
- port MaxRAMPercentage to Golang, adjust GC parameters(SetGCPercent/SetMemoryLimit) based on the target memory usage percentage, optimize … ☆13 · Nov 25, 2024 · Updated last year
- Scalable Kubernetes-native implementation of the Open Data Fabric protocol for global collaborative data processing ☆23 · Jun 4, 2026 · Updated last week