janhq / cortex.tensorrt-llm
Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It includes NVIDIA's TensorRT-LLM as a submodule for GPU-accelerated inference on NVIDIA GPUs.
☆42Updated 3 months ago
Alternatives and similar repositories for cortex.tensorrt-llm:
Users interested in cortex.tensorrt-llm are comparing it to the repositories listed below:
- run ollama & gguf easily with a single command☆49Updated 8 months ago
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others.☆28Updated this week
- A fast batching API to serve LLM models☆177Updated 8 months ago
- Serving LLMs in the HF-Transformers format via a PyFlask API☆68Updated 4 months ago
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you?☆22Updated 6 months ago
- Easily view and modify JSON datasets for large language models☆68Updated 3 months ago
- idea: https://github.com/nyxkrage/ebook-groupchat/☆84Updated 4 months ago
- An unsupervised model merging algorithm for Transformers-based language models.☆101Updated 8 months ago
- After my server ui improvements were successfully merged, consider this repo a playground for experimenting, tinkering and hacking around…☆56Updated 5 months ago
- Low-Rank adapter extraction for fine-tuned transformers models☆165Updated 8 months ago
- Very basic framework for parameterized large language model (Q)LoRA / (Q)Dora fine-tuning using mlx, mlx_lm, and OgbujiPT. Architecture …☆36Updated this week
- The heart of The Pulsar App: fast, secure and shared inference with a modern UI☆42Updated last month
- For inferring and serving local LLMs using the MLX framework☆90Updated 9 months ago
- Automatically quantize GGUF models☆150Updated this week
- Easily convert HuggingFace models to GGUF-format for llama.cpp☆21Updated 5 months ago
- A pipeline parallel training script for LLMs.☆116Updated this week
- Demo of an "always-on" AI assistant.☆23Updated 11 months ago
- cli tool to quantize gguf, gptq, awq, hqq and exl2 models☆66Updated last month
- An extension that lets the AI take the wheel, allowing it to use the mouse and keyboard, recognize UI elements, and prompt itself :3...no…☆103Updated 2 months ago
- An API for VoiceCraft.☆26Updated 6 months ago
- Implements harmful/harmless refusal removal using pure HF Transformers☆109Updated 7 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours☆58Updated 4 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆30Updated 5 months ago
- V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory!☆30Updated last month
- Easy-to-use, high-performance knowledge distillation for LLMs☆38Updated last week
- B-Llama3o: a Llama 3 with vision and audio understanding, as well as text, audio and animation data output.☆26Updated 7 months ago