Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
☆974 · Nov 6, 2023 · Updated 2 years ago
Alternatives and similar repositories for Llama-2-Open-Source-LLM-CPU-Inference
Users that are interested in Llama-2-Open-Source-LLM-CPU-Inference are comparing it to the libraries listed below.
- Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend… ☆1,942 · Mar 22, 2024 · Updated 2 years ago
- LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform… ☆1,465 · Nov 7, 2023 · Updated 2 years ago
- Run inference on MPT-30B using CPU ☆575 · Jun 30, 2023 · Updated 2 years ago
- prompt2model - Generate Deployable Models from Natural Language Instructions ☆2,015 · Dec 29, 2024 · Updated last year
- The open-source community's first downloadable and runnable Chinese LLaMA2 model! ☆2,208 · Oct 26, 2023 · Updated 2 years ago
- An implementation of "Retentive Network: A Successor to Transformer for Large Language Models" ☆1,215 · Oct 22, 2023 · Updated 2 years ago
- Chat with your documents on your local device using GPT models. No data leaves your device and 100% private. ☆22,212 · Mar 10, 2026 · Updated 2 months ago
- 🚀🎬 ShortGPT - Experimental AI framework for youtube shorts / tiktok channel automation ☆7,353 · Feb 10, 2025 · Updated last year
- An Open-source Toolkit for LLM Development ☆2,805 · Jan 13, 2025 · Updated last year
- Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. D… ☆11,987 · Oct 9, 2025 · Updated 7 months ago
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM ☆1,479 · May 1, 2025 · Updated last year
- Python package for easily interfacing with chat apps, with robust features and minimal code complexity. ☆3,503 · Jul 3, 2024 · Updated last year
- LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath ☆9,480 · Jun 7, 2025 · Updated 11 months ago
- LLaMA v2 Chatbot ☆1,412 · Aug 27, 2023 · Updated 2 years ago
- Inference Llama 2 in one file of pure C ☆19,548 · Aug 6, 2024 · Updated last year
- Large Language Model Text Generation Inference ☆10,856 · Mar 21, 2026 · Updated 2 months ago
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆12,870 · Apr 13, 2026 · Updated last month
- 🤖 Deploy a private ChatGPT alternative hosted within your VPC. 🔮 Connect it to your organization's knowledge base and use it as a corpo… ☆1,588 · Sep 11, 2023 · Updated 2 years ago
- [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language model for tool learning. ☆5,647 · May 21, 2025 · Updated last year
- TypeChat is a library that makes it easy to build natural language interfaces using types. ☆8,655 · May 8, 2026 · Updated 2 weeks ago
- Superagent protects your AI applications against prompt injections, data leaks, and harmful outputs. Embed safety directly into your app … ☆6,613 · Apr 11, 2026 · Updated last month
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ☆18,331 · May 19, 2026 · Updated last week
- 👾 Open source implementation of the ChatGPT Code Interpreter ☆3,846 · Nov 7, 2024 · Updated last year
- The no-code platform for building custom LLM Agents ☆2,941 · Jun 17, 2024 · Updated last year
- Official Pytorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR … ☆1,711 · Feb 3, 2025 · Updated last year
- An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents ☆5,928 · Sep 26, 2024 · Updated last year
- OpenChat: Advancing Open-source Language Models with Imperfect Data ☆5,486 · Sep 13, 2024 · Updated last year
- Universal LLM Deployment Engine with ML Compilation ☆22,687 · May 11, 2026 · Updated 2 weeks ago
- ☆2,556 · Jan 7, 2025 · Updated last year
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading ☆10,149 · Sep 7, 2024 · Updated last year
- Explore large language models in 512MB of RAM ☆1,194 · Feb 19, 2026 · Updated 3 months ago
- ☆1,062 · May 29, 2023 · Updated 2 years ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,963 · May 3, 2024 · Updated 2 years ago
- Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models" ☆2,661 · Mar 24, 2026 · Updated 2 months ago
- 🎙️🤖Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation w… ☆6,209 · Jan 20, 2026 · Updated 4 months ago
- Awesome things you can do with ChatGPT + Code Interpreter combo 🔥 ☆1,015 · Dec 10, 2023 · Updated 2 years ago
- AI companions with memory: a lightweight stack to create and host your own AI companions ☆5,952 · Apr 23, 2024 · Updated 2 years ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,922 · Sep 30, 2023 · Updated 2 years ago
- ☆1,027 · Jan 4, 2024 · Updated 2 years ago