Run inference on MPT-30B using CPU
☆576 · Jun 30, 2023 · Updated 2 years ago
Alternatives and similar repositories for mpt-30B-inference
Users who are interested in mpt-30B-inference compare it to the libraries listed below.
- Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A ☆974 · Nov 6, 2023 · Updated 2 years ago
- Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend… ☆1,944 · Mar 22, 2024 · Updated last year
- Python bindings for the Transformer models implemented in C/C++ using GGML library. ☆1,882 · Jan 28, 2024 · Updated 2 years ago
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM ☆1,481 · May 1, 2025 · Updated 10 months ago
- Chat with your data privately using MPT-30b ☆184 · Jun 29, 2023 · Updated 2 years ago
- LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath ☆9,478 · Jun 7, 2025 · Updated 9 months ago
- ☆2,559 · Jan 7, 2025 · Updated last year
- LLM as a Chatbot Service ☆3,332 · Nov 20, 2023 · Updated 2 years ago
- Run inference on replit-3B code instruct model using CPU ☆160 · Jul 5, 2023 · Updated 2 years ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,913 · Sep 30, 2023 · Updated 2 years ago
- Locally hosted tool that connects documents to LLMs for summarization and querying, with a simple GUI. ☆798 · Aug 1, 2023 · Updated 2 years ago
- An Open-source Toolkit for LLM Development ☆2,805 · Jan 13, 2025 · Updated last year
- Salesforce open-source LLMs with 8k sequence length. ☆725 · Jan 31, 2025 · Updated last year
- Cross-Platform, GPU Accelerated Whisper 🏎️ ☆1,804 · Feb 27, 2024 · Updated 2 years ago
- Large Language Model Text Generation Inference ☆10,795 · Jan 8, 2026 · Updated last month
- Run evaluation on LLMs using human-eval benchmark ☆427 · Sep 12, 2023 · Updated 2 years ago
- Scale LLM Engine public repository ☆820 · Updated this week
- LLMs custom-chatbots console ⚡ ☆5,263 · Feb 27, 2024 · Updated 2 years ago
- H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/ ☆4,891 · Feb 20, 2026 · Updated 2 weeks ago
- OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset ☆7,533 · Jul 16, 2023 · Updated 2 years ago
- LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform… ☆1,464 · Nov 7, 2023 · Updated 2 years ago
- This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond! ☆5,634 · Dec 19, 2025 · Updated 2 months ago
- prompt2model - Generate Deployable Models from Natural Language Instructions ☆2,009 · Dec 29, 2024 · Updated last year
- LLaMA v2 Chatbot ☆1,415 · Aug 27, 2023 · Updated 2 years ago
- FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models. ☆3,881 · Nov 11, 2025 · Updated 3 months ago
- Plug in and Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning … ☆4,570 · Jul 29, 2025 · Updated 7 months ago
- A ChatGPT plugin that allows you to load and edit your local files in a controlled way, as well as run any Python, JavaScript, and bash s… ☆1,196 · Aug 31, 2023 · Updated 2 years ago
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆12,734 · Feb 9, 2026 · Updated 3 weeks ago
- Explore large language models in 512MB of RAM ☆1,198 · Feb 19, 2026 · Updated 2 weeks ago
- An open source implementation of OpenAI's ChatGPT Code interpreter ☆3,574 · Mar 20, 2024 · Updated last year
- Implementation of plug in and play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens" ☆714 · Jan 7, 2024 · Updated 2 years ago
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading ☆9,971 · Sep 7, 2024 · Updated last year
- [ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters ☆5,933 · Mar 14, 2024 · Updated last year
- ☆1,058 · May 29, 2023 · Updated 2 years ago
- OpenChat: Advancing Open-source Language Models with Imperfect Data ☆5,475 · Sep 13, 2024 · Updated last year
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,175 · Oct 8, 2024 · Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,843 · Jun 10, 2024 · Updated last year
- Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud. ☆12,148 · Updated this week
- Sparsity-aware deep learning inference runtime for CPUs ☆3,163 · Jun 2, 2025 · Updated 9 months ago