NVIDIA / ChatRTX
A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM
☆2,935, updated 7 months ago
Alternatives and similar repositories for ChatRTX:
Users who are interested in ChatRTX compare it to the repositories listed below.
- Run Mixtral-8x7B models in Colab or consumer desktops ☆2,297, updated 11 months ago
- Home of StarCoder2! ☆1,894, updated last year
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,164, updated 5 months ago
- ☆943, updated last month
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain… ☆9,975, updated this week
- Large Language Model Text Generation Inference ☆9,941, updated this week
- Examples in the MLX framework ☆7,215, updated this week
- Foundational model for human-like, expressive TTS ☆4,076, updated 8 months ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,078, updated 2 weeks ago
- Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p… ☆1,228, updated 3 weeks ago
- Training LLMs with QLoRA + FSDP ☆1,467, updated 4 months ago
- All things prompt engineering ☆5,583, updated 9 months ago
- OpenCodeInterpreter is a suite of open-source code generation systems aimed at bridging the gap between large language models and sophist… ☆1,637, updated 10 months ago
- Tools for merging pretrained large language models. ☆5,478, updated this week
- High-speed Large Language Model Serving for Local Deployment ☆8,169, updated last month
- Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture. ☆2,956, updated last week
- Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ☆1,993, updated last week
- Generative AI extensions for onnxruntime ☆665, updated this week
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆5,898, updated 2 weeks ago
- A series of large language models trained from scratch by developers @01-ai ☆7,832, updated 4 months ago
- ☆2,475, updated last week
- An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm. ☆4,769, updated last week
- official repository of aiXcoder-7B Code Large Language Model ☆2,255, updated 2 months ago
- AIOS: AI Agent Operating System ☆3,992, updated last week
- Set of tools to assess and improve LLM security. ☆2,983, updated last month
- Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs. ☆1,849, updated this week
- ☆1,537, updated 11 months ago
- ☆8,606, updated 5 months ago
- [ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct ☆2,003, updated 4 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,014, updated 2 weeks ago