NVIDIA / ChatRTX
A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM
☆2,666 Updated last month
Related projects:
- tiny vision language model ☆4,893 Updated 3 weeks ago
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain… ☆8,186 Updated last week
- Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture. ☆2,142 Updated this week
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆2,745 Updated this week
- Home of StarCoder2! ☆1,718 Updated 6 months ago
- To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which achieves up to 20x com… ☆4,446 Updated last month
- ☆7,075 Updated last month
- A Native-PyTorch Library for LLM Fine-tuning ☆3,954 Updated this week
- Python SDK, Proxy Server to call 100+ LLM APIs using the OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker,… ☆12,291 Updated this week
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆3,493 Updated this week
- Tools for merging pretrained large language models. ☆4,501 Updated this week
- Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models ☆2,538 Updated last month
- Go ahead and axolotl questions ☆7,554 Updated last week
- Inference and training library for high-quality TTS models. ☆4,220 Updated last month
- Code examples and resources for DBRX, a large language model developed by Databricks ☆2,496 Updated 4 months ago
- Build AI Assistants with memory, knowledge and tools. ☆11,156 Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆6,008 Updated last week
- Blazingly fast LLM inference. ☆3,429 Updated this week
- Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory ☆15,611 Updated this week
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆7,459 Updated 2 months ago
- SGLang is a fast serving framework for large language models and vision language models. ☆5,162 Updated this week
- Ollama Python library ☆3,912 Updated last week
- Large Language Model Text Generation Inference ☆8,778 Updated this week
- LM Studio CLI ☆1,410 Updated last week
- Build resilient language agents as graphs. ☆5,662 Updated this week
- Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚ ☆6,777 Updated last week
- Modeling, training, eval, and inference code for OLMo ☆4,406 Updated this week
- AIOS: LLM Agent Operating System ☆3,219 Updated last week
- Open source codebase powering the HuggingChat app ☆7,214 Updated this week
- Foundational model for human-like, expressive TTS ☆3,721 Updated last month