kuleshov / minillm

MiniLLM is a minimal system for running modern LLMs on consumer-grade GPUs

☆915

Alternatives and similar repositories for minillm

Users who are interested in minillm are comparing it to the libraries listed below.

sahil280114 / codealpaca
☆1,476 · Updated 2 years ago
tloen / llama-int8
Quantized inference code for LLaMA models
☆1,050 · Updated 2 years ago
young-geng / EasyLM
Large language models (LLMs) made easy. EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆2,482 · Updated 11 months ago
lxe / simple-llm-finetuner
Simple UI for LLM Model Finetuning
☆2,062 · Updated last year
kuleshov-group / llmtools
Finetuning Large Language Models on One Consumer GPU in 2 Bits
☆726 · Updated last year
gururise / AlpacaDataCleaned
Alpaca dataset from Stanford, cleaned and curated
☆1,561 · Updated 2 years ago
johnsmith0031 / alpaca_lora_4bit
☆535 · Updated last year
NouamaneTazi / bloomz.cpp
C++ implementation for BLOOM
☆810 · Updated 2 years ago
jankais3r / LLaMA_MPS
Run LLaMA (and Stanford-Alpaca) inference on Apple Silicon GPUs.
☆588 · Updated 2 years ago
rmihaylov / falcontune
Tune any FALCON in 4-bit
☆467 · Updated last year
randaller / llama-chat
Chat with Meta's LLaMA models at home made easy
☆837 · Updated 2 years ago
zphang / minimal-llama
☆458 · Updated last year
qwopqwop200 / GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
☆3,053 · Updated last year
mbzuai-nlp / LaMini-LM
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
☆819 · Updated 2 years ago
abertsch72 / unlimiformer
Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
☆1,062 · Updated last year
melodysdreamj / WizardVicunaLM
LLM that combines the principles of WizardLM and VicunaLM
☆716 · Updated 2 years ago
jondurbin / airoboros
Customizable implementation of the self-instruct paper.
☆1,047 · Updated last year
s-JoL / Open-Llama
The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
☆47 · Updated 2 years ago
marella / ctransformers
Python bindings for the Transformer models implemented in C/C++ using GGML library.
☆1,868 · Updated last year
abacaj / fine-tune-mistral
Fine-tune Mistral-7B on 3090s, A100s, H100s
☆715 · Updated last year
bigcode-project / starcoder.cpp
C++ implementation for 💫StarCoder
☆455 · Updated last year
stochasticai / xTuring
Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-sour…
☆2,659 · Updated 9 months ago
trholding / llama2.c
Llama 2 Everywhere (L2E)
☆1,519 · Updated 6 months ago
teknium1 / GPTeacher
A collection of modular datasets generated by GPT-4, General-Instruct - Roleplay-Instruct - Code-Instruct - and Toolformer
☆1,637 · Updated last year
persimmon-ai-labs / adept-inference
Inference code for Persimmon-8B
☆415 · Updated last year
pointnetwork / point-alpaca
☆405 · Updated 2 years ago
turboderp / exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
☆2,887 · Updated last year
hyperonym / basaran
Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Tra…
☆1,298 · Updated last year
markasoftware / llama-cpu
Fork of Facebook's LLaMA model to run on CPU
☆772 · Updated 2 years ago
PotatoSpudowski / fastLLaMa
fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backe…
☆409 · Updated 2 years ago