NVIDIA / ChatRTX
A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM
☆3,106 · Updated last week
Alternatives and similar repositories for ChatRTX
Users who are interested in ChatRTX are comparing it to the libraries listed below.
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,431 · Updated last month
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,173 · Updated last year
- ☆1,027 · Updated 11 months ago
- Home of StarCoder2! ☆2,034 · Updated last year
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,154 · Updated this week
- Local AI API Platform ☆2,761 · Updated 6 months ago
- Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture. ☆3,739 · Updated 2 weeks ago
- PyTorch native post-training library ☆5,654 · Updated this week
- Run Mixtral-8x7B models in Colab or consumer desktops ☆2,326 · Updated last year
- ☆3,070 · Updated 2 months ago
- ☆1,548 · Updated last year
- Yes, it's another chat over documents implementation... but this one is entirely local! ☆1,818 · Updated last month
- Generative AI extensions for onnxruntime ☆944 · Updated this week
- A collection of standardized JSON descriptors for Large Language Model (LLM) files. ☆798 · Updated last year
- Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p… ☆1,313 · Updated 5 months ago
- Foundational model for human-like, expressive TTS ☆4,192 · Updated last year
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat… ☆12,755 · Updated this week
- The first open-source Artificial Narrow Intelligence generalist agentic framework Computer-Using-Agent that fully operates graphical-user… ☆1,321 · Updated 11 months ago
- ☆1,889 · Updated last week
- Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs. ☆2,246 · Updated this week
- Reaching LLaMA2 Performance with 0.1M Dollars ☆987 · Updated last year
- Llama-3 agents that can browse the web by following instructions and talking to you ☆1,406 · Updated last year
- Training LLMs with QLoRA + FSDP ☆1,537 · Updated last year
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,714 · Updated last week
- LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a ch… ☆5,964 · Updated last month
- Inference Llama 2 in one file of pure 🔥 ☆2,116 · Updated 2 months ago
- On-device AI across mobile, embedded and edge for PyTorch ☆4,193 · Updated this week
- Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch. ☆685 · Updated last year
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆6,180 · Updated 5 months ago
- Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM ☆1,463 · Updated 10 months ago