shawwn / llama

Inference code for LLaMA models

☆188

Alternatives and similar repositories for llama

Users who are interested in llama are comparing it to the libraries listed below.

pointnetwork / point-alpaca
☆405 Updated 2 years ago
aigoopy / llm-jeopardy
Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts
☆110 Updated last year
venuatu / llama
Inference code for LLaMA models
☆46 Updated 2 years ago
togethercomputer / redpajama.cpp
Extend the original llama.cpp repo to support redpajama model.
☆118 Updated 10 months ago
modular-ml / wrapyfi-examples_llama
Inference code for facebook LLaMA models with Wrapyfi support
☆129 Updated 2 years ago
lastmile-ai / llama-retrieval-plugin
LLaMa retrieval plugin script using OpenAI's retrieval plugin
☆324 Updated 2 years ago
NolanoOrg / cformers
SoTA Transformers with C-backend for fast inference on your CPU.
☆309 Updated last year
shawwn / scrap
Nearly a thousand bash and python scripts I've written over the years.
☆123 Updated 5 months ago
eugenepentland / landmark-attention-qlora
Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA
☆123 Updated 2 years ago
PotatoSpudowski / fastLLaMa
fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backe…
☆409 Updated 2 years ago
johnsmith0031 / alpaca_lora_4bit
☆535 Updated last year
NouamaneTazi / bloomz.cpp
C++ implementation for BLOOM
☆810 Updated 2 years ago
harrisonvanderbyl / rwkvstic
Framework agnostic python runtime for RWKV models
☆147 Updated last year
cmp-nct / ggllm.cpp
Falcon LLM ggml framework with CPU and GPU support
☆246 Updated last year
zphang / minimal-llama
☆458 Updated last year
devbrones / llama-prompts
A collection of prompts for Llama
☆100 Updated 2 years ago
tloen / llama-int8
Quantized inference code for LLaMA models
☆1,050 Updated 2 years ago
thomasantony / llamacpp-python
Python bindings for llama.cpp
☆197 Updated 2 years ago
shawwn / openai-server
OpenAI API webserver
☆188 Updated 3 years ago
markasoftware / llama-cpu
Fork of Facebook's LLaMa model to run on CPU
☆772 Updated 2 years ago
s4rduk4r / alpaca_lora_4bit_readme
Just a simple HowTo for https://github.com/johnsmith0031/alpaca_lora_4bit
☆31 Updated 2 years ago
petals-infra / chat.petals.dev
💬 Chatbot web app + HTTP and Websocket endpoints for LLM inference with the Petals client
☆313 Updated last year
arrmansa / Basic-UI-for-GPT-J-6B-with-low-vram
A repository to run gpt-j-6b on low vram machines (4.2 gb minimum vram for 2000 token context, 3.5 gb for 1000 token context). Model load…
☆114 Updated 3 years ago
AlpinDale / sparsegpt-for-LLaMA
Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation.
☆71 Updated 2 years ago
skeskinen / llama-lite
Embeddings focused small version of Llama NLP model
☆103 Updated 2 years ago
chris-alexiuk / alpaca-lora
Instruct-tune LLaMA on consumer hardware
☆362 Updated 2 years ago
gmorenz / llama
Inference code for LLaMA models
☆35 Updated 2 years ago
mayank31398 / GPTQ-for-SantaCoder
4 bits quantization of SantaCoder using GPTQ
☆51 Updated 2 years ago
AmericanPresidentJimmyCarter / yal-discord-bot
Yet Another LLaMA/ALPACA Discord Bot
☆70 Updated 2 years ago
geov-ai / geov
The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).…
☆121 Updated 2 years ago