NVIDIA / ChatRTX
A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM
☆3,068 Updated 6 months ago
Alternatives and similar repositories for ChatRTX
Users who are interested in ChatRTX are comparing it to the repositories listed below.
- ☆1,006 Updated 8 months ago
- Run Mixtral-8x7B models in Colab or consumer desktops ☆2,325 Updated last year
- ☆3,031 Updated last year
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,341 Updated 2 months ago
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,163 Updated last year
- Home of StarCoder2! ☆1,979 Updated last year
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizati… ☆11,880 Updated this week
- tiny vision language model ☆8,814 Updated 3 weeks ago
- Yes, it's another chat over documents implementation... but this one is entirely local! ☆1,795 Updated 6 months ago
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,615 Updated last month
- [ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct ☆2,049 Updated 11 months ago
- Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture. ☆3,483 Updated 2 weeks ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,903 Updated 2 years ago
- Simple HTML UI for Ollama ☆1,088 Updated last month
- Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM ☆1,447 Updated 6 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,119 Updated last week
- On-device AI across mobile, embedded and edge for PyTorch ☆3,310 Updated this week
- Official inference library for Mistral models ☆10,506 Updated 7 months ago
- Large Language Model Text Generation Inference ☆10,566 Updated last month
- Python bindings for llama.cpp ☆9,658 Updated 2 months ago
- ☆1,552 Updated last year
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,591 Updated this week
- Chatbot Ollama is an open source chat UI for Ollama. ☆1,799 Updated last month
- Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p… ☆1,296 Updated 2 months ago
- Tools for merging pretrained large language models. ☆6,378 Updated last month
- The first open-source Artificial Narrow Intelligence generalist agentic framework Computer-Using-Agent that fully operates graphical-user… ☆1,309 Updated 8 months ago
- Training LLMs with QLoRA + FSDP ☆1,527 Updated 11 months ago
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. ☆3,958 Updated 9 months ago
- Build and run containers leveraging NVIDIA GPUs ☆3,739 Updated this week
- Large-scale LLM inference engine ☆1,567 Updated last week