NVIDIA / ChatRTX
A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM
☆3,022 Updated 3 months ago
Alternatives and similar repositories for ChatRTX
Users who are interested in ChatRTX are comparing it to the libraries listed below.
- Run Mixtral-8x7B models in Colab or consumer desktops ☆2,316 Updated last year
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,168 Updated 9 months ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,245 Updated 2 weeks ago
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizati… ☆11,125 Updated this week
- [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which ach… ☆5,300 Updated 4 months ago
- ☆988 Updated 5 months ago
- High-speed Large Language Model Serving for Local Deployment ☆8,240 Updated 5 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,068 Updated last week
- Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture. ☆3,295 Updated last week
- Llama-3 agents that can browse the web by following instructions and talking to you ☆1,411 Updated 7 months ago
- Home of StarCoder2! ☆1,942 Updated last year
- Large-scale LLM inference engine ☆1,482 Updated last week
- Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ☆2,231 Updated this week
- Inference Llama 2 in one file of pure 🔥 ☆2,115 Updated last year
- ☆1,511 Updated this week
- A collection of standardized JSON descriptors for Large Language Model (LLM) files. ☆798 Updated 11 months ago
- Official inference library for Mistral models ☆10,377 Updated 4 months ago
- PyTorch native post-training library ☆5,361 Updated last week
- OpenChat: Advancing Open-source Language Models with Imperfect Data ☆5,386 Updated 10 months ago
- LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a ch… ☆5,941 Updated 3 months ago
- Training LLMs with QLoRA + FSDP ☆1,524 Updated 8 months ago
- Local AI API Platform ☆2,765 Updated 3 weeks ago
- [ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct ☆2,019 Updated 8 months ago
- Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p… ☆1,277 Updated 2 months ago
- Go ahead and axolotl questions ☆10,038 Updated this week
- Granite Code Models: A Family of Open Foundation Models for Code Intelligence ☆1,220 Updated last month
- Large Language Model Text Generation Inference ☆10,367 Updated last week
- This is a Phi Family of SLMs book for getting started with Phi Models. Phi is a family of open-source AI models developed by Microsoft. Phi… ☆3,456 Updated last week
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ☆6,943 Updated last year
- Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference? ☆1,708 Updated last year