FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

☆9,379

Alternatives and similar repositories for FlexLLMGen

Users who are interested in FlexLLMGen are comparing it to the libraries listed below.

tloen / alpaca-lora
Instruct-tune LLaMA on consumer hardware
☆18,981 Updated last year
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,087 Updated 4 months ago
antimatter15 / alpaca.cpp
Locally run an Instruction-Tuned Chat-Style LLM
☆10,196 Updated 2 years ago
qwopqwop200 / GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
☆3,079 Updated last year
togethercomputer / RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
☆4,856 Updated 11 months ago
artidoro / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆10,766 Updated last year
BlinkDL / RWKV-LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)…
☆14,152 Updated 2 weeks ago
openlm-research / open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
☆7,528 Updated 2 years ago
alpa-projects / alpa
Training and serving large-scale neural networks with auto parallelization.
☆3,166 Updated last year
tatsu-lab / stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
☆30,226 Updated last year
bitsandbytes-foundation / bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
☆7,767 Updated last week
lm-sys / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,263 Updated 5 months ago
henrywoo / pyllama
LLaMA: Open and Efficient Foundation Language Models
☆2,801 Updated 2 years ago
EleutherAI / gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
☆7,337 Updated 2 months ago
OpenGVLab / LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆5,922 Updated last year
bigscience-workshop / petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
☆9,839 Updated last year
ggml-org / ggml
Tensor library for machine learning
☆13,617 Updated this week
nebuly-ai / optimate
A collection of libraries to optimise AI model performance
☆8,363 Updated last year
nlpxucan / WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
☆9,460 Updated 5 months ago
facebookresearch / metaseq
Repo for external large-scale work
☆6,550 Updated last year
AutoGPTQ / AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
☆4,992 Updated 7 months ago
facebookincubator / AITemplate
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N…
☆4,695 Updated last month
project-baize / baize-chatbot
Let ChatGPT teach your own chatbot in hours with a single GPU!
☆3,168 Updated last year
BlinkDL / ChatRWKV
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
☆9,512 Updated 2 months ago
zai-org / GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
☆7,678 Updated 2 years ago
shawwn / llama-dl
High-speed download of LLaMA, Facebook's 65B parameter GPT model
☆4,156 Updated 2 years ago
mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆7,128 Updated last year
Stability-AI / StableLM
StableLM: Stability AI Language Models
☆15,787 Updated last year
young-geng / EasyLM
Large language models (LLMs) made easy. EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆2,501 Updated last year
NVIDIA / FasterTransformer
Transformer-related optimization, including BERT and GPT
☆6,355 Updated last year