markasoftware / llama-cpuLinks

Fork of Facebooks LLaMa model to run on CPU

☆771

Alternatives and similar repositories for llama-cpu

Users that are interested in llama-cpu are comparing it to the libraries listed below

Sorting:

tloen / llama-int8
Quantized inference code for LLaMA models
☆1,047Updated 2 years ago
NouamaneTazi / bloomz.cpp
C++ implementation for BLOOM
☆808Updated 2 years ago
lxe / simple-llm-finetuner
Simple UI for LLM Model Finetuning
☆2,065Updated last year
chris-alexiuk / alpaca-lora
Instruct-tune LLaMA on consumer hardware
☆362Updated 2 years ago
trholding / llama2.c
Llama 2 Everywhere (L2E)
☆1,521Updated 3 months ago
randaller / llama-chat
Chat with Meta's LLaMA models at home made easy
☆840Updated 2 years ago
okuvshynov / slowllama
Finetune llama2-70b and codellama on MacBook Air without quantization
☆450Updated last year
lastmile-ai / llama-retrieval-plugin
LLaMa retrieval plugin script using OpenAI's retrieval plugin
☆324Updated 2 years ago
replit / ReplitLM
Inference code and configs for the ReplitLM model family
☆1,010Updated 2 years ago
sahil280114 / codealpaca
☆1,498Updated 2 years ago
s-JoL / Open-Llama
The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
☆52Updated 2 years ago
shawwn / llama
Inference code for LLaMA models
☆189Updated 2 years ago
hyperonym / basaran
Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Tra…
☆1,294Updated last year
danielgross / LlamaAcademy
A school for camelids
☆1,208Updated 2 years ago
jankais3r / LLaMA_MPS
Run LLaMA (and Stanford-Alpaca) inference on Apple Silicon GPUs.
☆585Updated 2 years ago
pointnetwork / point-alpaca
☆404Updated 2 years ago
Maknee / minigpt4.cpp
Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)
☆568Updated 2 years ago
kuleshov / minillm
MiniLLM is a minimal system for running modern LLMs on consumer-grade GPUs
☆936Updated 2 years ago
abacaj / mpt-30B-inference
Run inference on MPT-30B using CPU
☆576Updated 2 years ago
zanussbaum / gpt4all.cpp
Locally run an Assistant-Tuned Chat-Style LLM
☆497Updated 2 years ago
fafrd / aquarium
AI-controlled Linux Containers
☆668Updated 4 months ago
PotatoSpudowski / fastLLaMa
fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backe…
☆412Updated 2 years ago
r2d4 / openlm
OpenAI-compatible Python client that can call any LLM
☆373Updated 2 years ago
wgryc / phasellm
Large language model evaluation and workflow framework from Phase AI.
☆459Updated 10 months ago
anysphere / gpt-4-for-code
Some examples of GPT-4 for code!
☆496Updated 2 years ago
bigcode-project / starcoder.cpp
C++ implementation for 💫StarCoder
☆457Updated 2 years ago
zphang / minimal-llama
☆457Updated 2 years ago
santiagobasulto / ipython-gpt
An ChatGPT integration for Jupyter Notebooks and the IPython Shell
☆589Updated 2 years ago
piercefreeman / gpt-json
Structured and typehinted GPT responses in Python
☆742Updated last year
Futrell / ziplm
☆254Updated 2 years ago