NVIDIA / ChatRTX
A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM
☆3,032 Updated 4 months ago
Alternatives and similar repositories for ChatRTX
Users who are interested in ChatRTX are comparing it to the libraries listed below.
- Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture. ☆3,340 Updated last week
- ☆1,539 Updated 3 weeks ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,271 Updated last month
- ☆993 Updated 6 months ago
- Local AI API Platform ☆2,762 Updated last month
- Generative AI extensions for onnxruntime ☆797 Updated this week
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizati… ☆11,365 Updated this week
- ☆3,003 Updated 11 months ago
- Official Code for Stable Cascade ☆6,592 Updated last year
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,170 Updated 10 months ago
- This is a Phi Family of SLMs book for getting started with Phi Models. Phi is a family of open-source AI models developed by Microsoft. Phi… ☆3,476 Updated this week
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,895 Updated last year
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,079 Updated last week
- Tools for merging pretrained large language models. ☆6,195 Updated this week
- Large-scale LLM inference engine ☆1,513 Updated this week
- Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs. ☆2,054 Updated this week
- Home of StarCoder2! ☆1,960 Updated last year
- ☆1,555 Updated last year
- Large World Model -- Modeling Text and Video with Millions Context ☆7,324 Updated 10 months ago
- Official implementation for the paper: "Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering" ☆3,881 Updated 8 months ago
- LM Studio CLI ☆3,553 Updated this week
- Training LLMs with QLoRA + FSDP ☆1,526 Updated 9 months ago
- Yes, it's another chat over documents implementation... but this one is entirely local! ☆1,787 Updated 4 months ago
- Foundational model for human-like, expressive TTS ☆4,146 Updated last year
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ☆4,922 Updated 4 months ago
- Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p… ☆1,285 Updated last week
- Run Mixtral-8x7B models in Colab or consumer desktops ☆2,317 Updated last year
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ☆7,009 Updated last year
- A collection of standardized JSON descriptors for Large Language Model (LLM) files. ☆801 Updated last year
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆12,640 Updated this week