nmntz/bloomz.cpp

C++ implementation for BLOOM

☆812

Alternatives and similar repositories for bloomz.cpp

Users interested in bloomz.cpp are comparing it to the libraries listed below.

antimatter15 / alpaca.cpp
Locally run an Instruction-Tuned Chat-Style LLM
☆10,138 · Apr 19, 2023 · Updated 3 years ago
NolanoOrg / cformers
SoTA Transformers with C-backend for fast inference on your CPU.
☆312 · Dec 9, 2023 · Updated 2 years ago
RWKV / rwkv.cpp
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
☆1,574 · Mar 23, 2025 · Updated last year
qwopqwop200 / GPTQ-for-LLaMa
4 bits quantization of LLaMA using GPTQ
☆3,073 · Jul 13, 2024 · Updated last year
rustformers / llm
[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models
☆6,154 · Jun 24, 2024 · Updated 2 years ago
tloen / alpaca-lora
Instruct-tune LLaMA on consumer hardware
☆18,914 · Jul 29, 2024 · Updated last year
ggml-org / ggml
Tensor library for machine learning
☆14,935 · Jun 26, 2026 · Updated last week
openlm-research / open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
☆7,529 · Jul 16, 2023 · Updated 2 years ago
marella / ctransformers
Python bindings for the Transformer models implemented in C/C++ using GGML library.
☆1,886 · Jan 28, 2024 · Updated 2 years ago
skeskinen / bert.cpp
ggml implementation of BERT
☆501 · Feb 23, 2024 · Updated 2 years ago
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,085 · Jul 1, 2025 · Updated last year
thomasantony / llamacpp-python
Python bindings for llama.cpp
☆199 · Apr 22, 2023 · Updated 3 years ago
FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
☆9,362 · Oct 28, 2024 · Updated last year
bigscience-workshop / petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
☆10,248 · Sep 7, 2024 · Updated last year
tatsu-lab / stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
☆30,248 · Jul 17, 2024 · Updated last year
BlinkDL / RWKV-LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)…
☆14,585 · Jun 13, 2026 · Updated 3 weeks ago
monatis / clip.cpp
CLIP inference in plain C/C++ with no extra dependencies
☆563 · Jun 19, 2025 · Updated last year
mlc-ai / web-stable-diffusion
Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.
☆3,719 · Mar 12, 2024 · Updated 2 years ago
turboderp / exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
☆2,929 · Sep 30, 2023 · Updated 2 years ago
BlinkDL / ChatRWKV
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
☆9,490 · May 29, 2026 · Updated last month
ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
☆51,240 · Updated this week
cocktailpeanut / dalai
The simplest way to run LLaMA on your local machine
☆12,920 · Jun 18, 2024 · Updated 2 years ago
salesforce / CodeGen
CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
☆5,178 · Jun 2, 2026 · Updated last month
ggml-org / llama.cpp
LLM inference in C/C++
☆118,422 · Updated this week
gururise / AlpacaDataCleaned
Alpaca dataset from Stanford, cleaned and curated
☆1,605 · Mar 7, 2026 · Updated 3 months ago
cmp-nct / ggllm.cpp
Falcon LLM ggml framework with CPU and GPU support
☆249 · Jan 22, 2024 · Updated 2 years ago
turboderp-org / exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
☆4,578 · Mar 4, 2026 · Updated 4 months ago
PABannier / biogpt.cpp
Port of Microsoft's BioGPT in C/C++ using ggml
☆87 · Feb 21, 2024 · Updated 2 years ago
lm-sys / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,492 · May 1, 2026 · Updated 2 months ago
deep-diver / LLM-As-Chatbot
LLM as a Chatbot Service
☆3,323 · Nov 20, 2023 · Updated 2 years ago
Stability-AI / StableLM
StableLM: Stability AI Language Models
☆15,696 · Apr 8, 2024 · Updated 2 years ago
intel / intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl…
☆2,176 · Oct 8, 2024 · Updated last year
linhduongtuan / BLOOM-LORA
Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs…
☆183 · Jun 18, 2023 · Updated 3 years ago
huggingface / text-generation-inference
Large Language Model Text Generation Inference
☆10,864 · Mar 21, 2026 · Updated 3 months ago
ravenscroftj / turbopilot
Turbopilot is an open source large-language-model based code completion engine that runs locally on CPU
☆3,786 · Sep 30, 2023 · Updated 2 years ago
OpenGVLab / LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆5,920 · Mar 14, 2024 · Updated 2 years ago
nlpxucan / WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
☆9,483 · Jun 7, 2025 · Updated last year
mlc-ai / mlc-llm
Universal LLM Deployment Engine with ML Compilation
☆22,901 · Updated this week
artidoro / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆10,940 · Jun 10, 2024 · Updated 2 years ago