ArEnSc / Production-RWKV

This project aims to make RWKV accessible to everyone through a Hugging Face-like interface, while staying close to the RWKV R&D branch of the code.

☆65

Alternatives and similar repositories for Production-RWKV

Users who are interested in Production-RWKV are comparing it to the libraries listed below.

harrisonvanderbyl / rwkvstic
Framework agnostic python runtime for RWKV models
☆147 Updated 2 years ago
mrsteyk / RWKV-LM-deepspeed
☆43 Updated 2 years ago
BlinkDL / RWKV-v2-RNN-Pile
RWKV-v2-RNN trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details.
☆67 Updated 3 years ago
kyleliang919 / Long-context-transformers
Exploring finetuning public checkpoints on filtered 8K sequences from the Pile
☆116 Updated 2 years ago
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning a RoPE model on larger sequences than the pre-trained model adapts the model's context limit
☆63 Updated 2 years ago
BlinkDL / WorldModel
Let us make Psychohistory (as in Asimov) a reality, and accessible to everyone. Useful for LLM grounding and games / fiction / business /…
☆40 Updated 2 years ago
Rallio67 / language-model-agents
Experiments with generating open-source language model assistants
☆97 Updated 2 years ago
zphang / minimal-gpt-neox-20b
☆131 Updated 3 years ago
rmihaylov / mpttune
Tune MPTs
☆84 Updated 2 years ago
geov-ai / geov
The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).…
☆121 Updated 2 years ago
kernelmachine / cbtm
Code repository for the c-BTM paper
☆108 Updated 2 years ago
NolanoOrg / sparse_quant_llms
SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia
☆41 Updated 2 years ago
euclaise / supertrainer2000
☆50 Updated last year
AeroScripts / HiddenEngrams
Hidden Engrams: Long Term Memory for Transformer Model Inference
☆35 Updated 4 years ago
EleutherAI / magiCARP
One stop shop for all things carp
☆59 Updated 3 years ago
BlinkDL / SmallInitEmb
LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence
☆61 Updated 3 years ago
huu4ontocord / MDEL
Multi-Domain Expert Learning
☆67 Updated last year
RWKV / RWKV-infctx-trainer
RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond!
☆147 Updated last year
AXKuhta / rwkv-onnx-dml
Run ONNX RWKV-v4 models with GPU acceleration using DirectML [Windows], or just on CPU [Windows AND Linux]; Limited to 430M model at this…
☆21 Updated 2 years ago
McGill-NLP / length-generalization
Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023
☆138 Updated last year
codekansas / rwkv
RWKV model implementation
☆38 Updated 2 years ago
harrisonvanderbyl / rwkv_chatbot
rwkv_chatbot
☆62 Updated 2 years ago
Birch-san / booru-embed
[WIP] Transformer to embed Danbooru labelsets
☆13 Updated last year
cat-state / tinypar
☆20 Updated 2 years ago
ConiferLabsWA / flan-ul2-alpaca
☆33 Updated 2 years ago
SeanNaren / min-LLM
Minimal code to train a Large Language Model (LLM).
☆172 Updated 3 years ago
lucidrains / memory-editable-transformer
My explorations into editing the knowledge and memories of an attention network
☆35 Updated 3 years ago
crowsonkb / LDLM
Latent Diffusion Language Models
☆70 Updated 2 years ago
lucidrains / CoLT5-attention
Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch
☆230 Updated last year
Zyphra / Zyda_processing
☆39 Updated last year