Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
☆973 • Nov 6, 2023 • Updated 2 years ago
Alternatives and similar repositories for Llama-2-Open-Source-LLM-CPU-Inference
Users that are interested in Llama-2-Open-Source-LLM-CPU-Inference are comparing it to the libraries listed below.
- Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend… ☆1,938 • Mar 22, 2024 • Updated 2 years ago
- LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform… ☆1,465 • Nov 7, 2023 • Updated 2 years ago
- Run inference on MPT-30B using CPU ☆575 • Jun 30, 2023 • Updated 2 years ago
- prompt2model - Generate Deployable Models from Natural Language Instructions ☆2,014 • Dec 29, 2024 • Updated last year
- The first Chinese LLaMA2 model in the open-source community that can be downloaded and run! ☆2,206 • Oct 26, 2023 • Updated 2 years ago
- An implementation of "Retentive Network: A Successor to Transformer for Large Language Models" ☆1,215 • Oct 22, 2023 • Updated 2 years ago
- Chat with your documents on your local device using GPT models. No data leaves your device and 100% private. ☆22,217 • Mar 10, 2026 • Updated 3 months ago
- 🚀🎬 ShortGPT - Experimental AI framework for youtube shorts / tiktok channel automation ☆7,406 • Feb 10, 2025 • Updated last year
- An Open-source Toolkit for LLM Development ☆2,805 • Jan 13, 2025 • Updated last year
- Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. D… ☆11,985 • Oct 9, 2025 • Updated 8 months ago
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM ☆1,477 • May 1, 2025 • Updated last year
- Python package for easily interfacing with chat apps, with robust features and minimal code complexity. ☆3,500 • Jul 3, 2024 • Updated last year
- LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath ☆9,486 • Jun 7, 2025 • Updated last year
- LLaMA v2 Chatbot ☆1,412 • Aug 27, 2023 • Updated 2 years ago
- Inference Llama 2 in one file of pure C ☆19,631 • Aug 6, 2024 • Updated last year
- Large Language Model Text Generation Inference ☆10,863 • Mar 21, 2026 • Updated 2 months ago
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆12,903 • Apr 13, 2026 • Updated 2 months ago
- 🤖 Deploy a private ChatGPT alternative hosted within your VPC. 🔮 Connect it to your organization's knowledge base and use it as a corpo… ☆1,589 • Sep 11, 2023 • Updated 2 years ago
- [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language model for tool learning. ☆5,665 • May 21, 2025 • Updated last year
- TypeChat is a library that makes it easy to build natural language interfaces using types. ☆8,665 • Jun 3, 2026 • Updated last week
- Superagent protects your AI applications against prompt injections, data leaks, and harmful outputs. Embed safety directly into your app … ☆6,628 • Apr 11, 2026 • Updated 2 months ago
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ☆18,351 • May 19, 2026 • Updated 3 weeks ago
- 👾 Open source implementation of the ChatGPT Code Interpreter ☆3,845 • Nov 7, 2024 • Updated last year
- The no-code platform for building custom LLM Agents ☆2,959 • Jun 17, 2024 • Updated last year
- Official Pytorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR … ☆1,709 • Feb 3, 2025 • Updated last year
- An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents ☆5,933 • Sep 26, 2024 • Updated last year
- OpenChat: Advancing Open-source Language Models with Imperfect Data ☆5,483 • Sep 13, 2024 • Updated last year
- Universal LLM Deployment Engine with ML Compilation ☆22,792 • May 11, 2026 • Updated last month
- ☆2,555 • Jan 7, 2025 • Updated last year
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading ☆10,199 • Sep 7, 2024 • Updated last year
- Explore large language models in 512MB of RAM ☆1,192 • Feb 19, 2026 • Updated 3 months ago
- ☆1,061 • May 29, 2023 • Updated 3 years ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,985 • May 3, 2024 • Updated 2 years ago
- Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models" ☆2,667 • Mar 24, 2026 • Updated 2 months ago
- 🎙️🤖Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation w… ☆6,210 • Jan 20, 2026 • Updated 4 months ago
- Awesome things you can do with ChatGPT + Code Interpreter combo 🔥 ☆1,015 • Dec 10, 2023 • Updated 2 years ago
- AI companions with memory: a lightweight stack to create and host your own AI companions ☆5,961 • Apr 23, 2024 • Updated 2 years ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,924 • Sep 30, 2023 • Updated 2 years ago
- ☆1,027 • Jan 4, 2024 • Updated 2 years ago