Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
☆973Nov 6, 2023Updated 2 years ago
Alternatives and similar repositories for Llama-2-Open-Source-LLM-CPU-Inference
Users who are interested in Llama-2-Open-Source-LLM-CPU-Inference are comparing it to the repositories listed below
- Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend…☆1,944Mar 22, 2024Updated last year
- LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform…☆1,463Nov 7, 2023Updated 2 years ago
- Run inference on MPT-30B using CPU☆576Jun 30, 2023Updated 2 years ago
- prompt2model - Generate Deployable Models from Natural Language Instructions☆2,009Dec 29, 2024Updated last year
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM☆1,481May 1, 2025Updated 10 months ago
- An Open-source Toolkit for LLM Development☆2,805Jan 13, 2025Updated last year
- Python package for easily interfacing with chat apps, with robust features and minimal code complexity.☆3,514Jul 3, 2024Updated last year
- LLaMA v2 Chatbot☆1,415Aug 27, 2023Updated 2 years ago
- 🚀🎬 ShortGPT - Experimental AI framework for youtube shorts / tiktok channel automation☆7,122Feb 10, 2025Updated last year
- An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"☆1,212Oct 22, 2023Updated 2 years ago
- Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. D…☆12,012Oct 9, 2025Updated 4 months ago
- LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath☆9,477Jun 7, 2025Updated 8 months ago
- The open-source community's first downloadable and runnable Chinese LLaMA2 model!☆2,223Oct 26, 2023Updated 2 years ago
- Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.☆22,194Updated this week
- [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.☆5,536May 21, 2025Updated 9 months ago
- 👾 Open source implementation of the ChatGPT Code Interpreter☆3,860Nov 7, 2024Updated last year
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)☆12,734Feb 9, 2026Updated 3 weeks ago
- ☆2,559Jan 7, 2025Updated last year
- Superagent protects your AI applications against prompt injections, data leaks, and harmful outputs. Embed safety directly into your app …☆6,422Feb 3, 2026Updated last month
- 🤖 Deploy a private ChatGPT alternative hosted within your VPC. 🔮 Connect it to your organization's knowledge base and use it as a corpo…☆1,506Sep 11, 2023Updated 2 years ago
- An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents☆5,876Sep 26, 2024Updated last year
- The no-code platform for building custom LLM Agents☆2,943Jun 17, 2024Updated last year
- Inference Llama 2 in one file of pure C☆19,213Aug 6, 2024Updated last year
- Large Language Model Text Generation Inference☆10,788Jan 8, 2026Updated last month
- Explore large language models in 512MB of RAM☆1,198Feb 19, 2026Updated 2 weeks ago
- kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)☆599Updated this week
- OpenChat: Advancing Open-source Language Models with Imperfect Data☆5,475Sep 13, 2024Updated last year
- ☆1,058May 29, 2023Updated 2 years ago
- Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We als…☆18,220Nov 3, 2025Updated 4 months ago
- Open Source AI Platform - AI Chat with advanced features that works with every LLM☆17,687Updated this week
- Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"☆2,607Dec 11, 2024Updated last year
- Awesome things you can do with ChatGPT + Code Interpreter combo 🔥☆1,017Dec 10, 2023Updated 2 years ago
- 🎙️🤖Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation w…☆6,201Jan 20, 2026Updated last month
- Official Pytorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR …☆1,707Feb 3, 2025Updated last year
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading☆9,971Sep 7, 2024Updated last year
- AgentTuning: Enabling Generalized Agent Abilities for LLMs☆1,479Oct 31, 2023Updated 2 years ago
- Universal LLM Deployment Engine with ML Compilation☆22,082Updated this week
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.☆8,896May 3, 2024Updated last year
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.☆2,913Sep 30, 2023Updated 2 years ago