NolanoOrg/cformers

SoTA Transformers with C-backend for fast inference on your CPU.

☆312

Alternatives and similar repositories for cformers

Users that are interested in cformers are comparing it to the libraries listed below.

NolanoOrg / smol-gpt
Smol but mighty language model
☆65 · Apr 4, 2023 · Updated 3 years ago
NolanoOrg / sparse_quant_llms
SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia
☆42 · Mar 13, 2023 · Updated 3 years ago
jploski / ggml
Falcon7B + Falcon40B support - in branch falcon40b. Now all good and working. But main action now in https://github.com/cmp-nct/ggllm.cpp
☆10 · Sep 30, 2023 · Updated 2 years ago
NolanoOrg / llama-int4-quant
☆26 · Mar 11, 2023 · Updated 3 years ago
nmntz / bloomz.cpp
C++ implementation for BLOOM
☆811 · May 13, 2023 · Updated 3 years ago
virtualzx-nad / easy_llm_agents
Flexible Python package for managing and extending LLM based agents
☆24 · May 14, 2023 · Updated 3 years ago
PotatoSpudowski / fastLLaMa
fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backe…
☆413 · Jun 2, 2023 · Updated 3 years ago
RWKV / rwkv.cpp
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
☆1,579 · Mar 23, 2025 · Updated last year
NolanoOrg / InstructLLaMa.cpp
Fast inference of Instruct tuned LLaMa on your personal devices.
☆23 · Mar 16, 2023 · Updated 3 years ago
lachlansneff / sparsellama
☆40 · Mar 25, 2023 · Updated 3 years ago
qwopqwop200 / GPTQ-for-LLaMa
4 bits quantization of LLaMA using GPTQ
☆3,072 · Jul 13, 2024 · Updated 2 years ago
open-wa / wa-automate-deploy-heroku
Easy API deployment for Heroku
☆14 · Mar 6, 2023 · Updated 3 years ago
thomasantony / llamacpp-python
Python bindings for llama.cpp
☆199 · Apr 22, 2023 · Updated 3 years ago
AlpinDale / pygmalion.cpp
C/C++ implementation of PygmalionAI/pygmalion-6b
☆54 · Apr 18, 2023 · Updated 3 years ago
antimatter15 / alpaca.cpp
Locally run an Instruction-Tuned Chat-Style LLM
☆10,128 · Apr 19, 2023 · Updated 3 years ago
paniphons / open-textbot-datasets
Collection of various text datasets to assist ML researchers in training or fine-tuning their models
☆21 · Apr 1, 2023 · Updated 3 years ago
sahil280114 / codealpaca
☆1,515 · May 12, 2023 · Updated 3 years ago
pointnetwork / point-alpaca
☆402 · Mar 22, 2023 · Updated 3 years ago
lastmile-ai / llama-retrieval-plugin
LLaMa retrieval plugin script using OpenAI's retrieval plugin
☆321 · Mar 27, 2023 · Updated 3 years ago
KerfuffleV2 / smolrsrwkv
A relatively basic implementation of RWKV in Rust written by someone with very little math and ML knowledge. Supports 32, 8 and 4 bit eva…
☆95 · Sep 2, 2023 · Updated 2 years ago
abetlen / program-constrained-language-model-sampling
☆35 · Apr 8, 2023 · Updated 3 years ago
wangchou / callCoreMLFromCppOrPython
Example of using CoreML from C++
☆24 · Jun 14, 2023 · Updated 3 years ago
markasoftware / llama-cpu
Fork of Facebook's LLaMa model to run on CPU
☆766 · Mar 6, 2023 · Updated 3 years ago
gururise / AlpacaDataCleaned
Alpaca dataset from Stanford, cleaned and curated
☆1,602 · Mar 7, 2026 · Updated 4 months ago
hahnyuan / RPTQ4LLM
Reorder-based post-training quantization for large language model
☆199 · May 17, 2023 · Updated 3 years ago
bigcode-project / starcoder.cpp
C++ implementation for 💫StarCoder
☆459 · Sep 9, 2023 · Updated 2 years ago
skeskinen / bert.cpp
ggml implementation of BERT
☆501 · Feb 23, 2024 · Updated 2 years ago
ConiferLabsWA / flan-ul2-alpaca
☆33 · Apr 23, 2023 · Updated 3 years ago
rustformers / llm
[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models
☆6,156 · Jun 24, 2024 · Updated 2 years ago
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,082 · Jul 1, 2025 · Updated last year
eXpl0it3r / WhisperSFML
Using OpenAI's Whisper via whisper.cpp with SFML
☆14 · Dec 2, 2025 · Updated 7 months ago
Noeda / rllama
Rust+OpenCL+AVX2 implementation of LLaMA inference code
☆554 · Feb 12, 2024 · Updated 2 years ago
AlpinDale / sparsegpt-for-LLaMA
Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation.
☆71 · Mar 30, 2023 · Updated 3 years ago
teknium1 / GPTeacher
A collection of modular datasets generated by GPT-4, General-Instruct - Roleplay-Instruct - Code-Instruct - and Toolformer
☆1,668 · Sep 15, 2023 · Updated 2 years ago
tloen / alpaca-lora
Instruct-tune LLaMA on consumer hardware
☆18,912 · Jul 29, 2024 · Updated last year
lxe / simple-llm-finetuner
Simple UI for LLM Model Finetuning
☆2,053 · Dec 21, 2023 · Updated 2 years ago
OpenNMT / CTranslate2
Fast inference engine for Transformer models
☆4,585 · Jul 3, 2026 · Updated 3 weeks ago
harrisonvanderbyl / rwkv-cpp-accelerated
A torchless, c++ rwkv implementation using 8bit quantization, written in cuda/hip/vulkan for maximum compatibility and minimum dependenci…
☆312 · Jan 31, 2024 · Updated 2 years ago
ravenscroftj / turbopilot
Turbopilot is an open source large-language-model based code completion engine that runs locally on CPU
☆3,785 · Sep 30, 2023 · Updated 2 years ago