BlinkDL / RWKV-LM

RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). The current version is RWKV-7 "Goose". It combines the best of RNNs and transformers: great performance, linear time, constant space (no KV cache), fast training, infinite ctx_len, and free sentence embeddings.

☆14,088

Alternatives and similar repositories for RWKV-LM

Users who are interested in RWKV-LM are comparing it to the libraries listed below.

BlinkDL / ChatRWKV
ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and open source.
☆9,510 Updated last month
tloen / alpaca-lora
Instruct-tune LLaMA on consumer hardware
☆18,977 Updated last year
lm-sys / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,199 Updated 5 months ago
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,082 Updated 4 months ago
tatsu-lab / stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
☆30,190 Updated last year
artidoro / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆10,719 Updated last year
OpenGVLab / LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆5,910 Updated last year
Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆20,280 Updated this week
zai-org / GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
☆7,681 Updated 2 years ago
huggingface / peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
☆19,959 Updated this week
FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
☆9,369 Updated last year
microsoft / LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
☆12,843 Updated 10 months ago
nlpxucan / WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
☆9,459 Updated 4 months ago
Stability-AI / StableLM
StableLM: Stability AI Language Models
☆15,795 Updated last year
Vision-CAIR / MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
☆25,744 Updated last year
OptimalScale / LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
☆8,473 Updated 2 months ago
openlm-research / open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
☆7,523 Updated 2 years ago
mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆7,102 Updated last year
facebookresearch / xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
☆10,052 Updated this week
haotian-liu / LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
☆23,870 Updated last year
deepspeedai / DeepSpeedExamples
Example models using DeepSpeed
☆6,701 Updated 2 weeks ago
Instruction-Tuning-with-GPT-4 / GPT-4-LLM
Instruction Tuning with GPT-4
☆4,337 Updated 2 years ago
bitsandbytes-foundation / bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
☆7,687 Updated last week
togethercomputer / RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
☆4,839 Updated 10 months ago
LianjiaTech / BELLE
BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model)
☆8,245 Updated last year
yizhongw / self-instruct
Aligning pretrained language models with instruction data generated by themselves.
☆4,507 Updated 2 years ago
deepspeedai / DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
☆40,538 Updated this week
microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆21,798 Updated 4 months ago
henrywoo / pyllama
LLaMA: Open and Efficient Foundation Language Models
☆2,801 Updated last year
meta-llama / llama
Inference code for Llama models
☆58,886 Updated 9 months ago