pytorch / torchchat
Run PyTorch LLMs locally on servers, desktop and mobile
☆3,100 Updated this week
Related projects:
- Lightning-fast serving engine for AI models. Flexible. Easy. Enterprise-scale. ☆2,055 Updated this week
- A Native-PyTorch Library for LLM Fine-tuning ☆3,942 Updated this week
- ☆2,652 Updated this week
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ☆2,989 Updated last week
- Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models ☆2,531 Updated last month
- nanoGPT style version of Llama 3.1 ☆1,162 Updated last month
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality! ☆2,884 Updated last month
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ☆3,155 Updated this week
- The easiest way to use Agentic RAG in any enterprise ☆3,132 Updated this week
- Ollama Python library ☆3,912 Updated this week
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆2,739 Updated last week
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT) ☆4,578 Updated last week
- tiny vision language model ☆4,893 Updated 3 weeks ago
- This is a Phi-3 book for getting started with Phi-3. Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most … ☆2,227 Updated 2 weeks ago
- Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization ☆2,580 Updated 2 weeks ago
- Efficient Triton Kernels for LLM Training ☆2,911 Updated this week
- High-quality datasets, tools, and concepts for LLM fine-tuning. ☆1,664 Updated last month
- Blazingly fast LLM inference. ☆3,406 Updated this week
- On-device AI across mobile, embedded and edge for PyTorch ☆1,698 Updated this week
- Agentic components of the Llama Stack APIs ☆3,222 Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆9,780 Updated this week
- g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ☆869 Updated this week
- A collection of notebooks/recipes showcasing some fun and effective ways of using Claude. ☆4,725 Updated this week
- Tools for merging pretrained large language models. ☆4,501 Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆6,008 Updated this week
- AIOS: LLM Agent Operating System ☆3,219 Updated this week
- The n-gram Language Model ☆1,288 Updated last month
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆5,911 Updated this week
- SGLang is a fast serving framework for large language models and vision language models. ☆5,121 Updated this week
- Modeling, training, eval, and inference code for OLMo ☆4,399 Updated this week