tensorpro / tpu_rwkv

JAX implementations of RWKV

☆19

Alternatives and similar repositories for tpu_rwkv

Users who are interested in tpu_rwkv are comparing it to the libraries listed below.

cwhy / rwkv-decon
Trying to deconstruct RWKV in understandable terms
☆14 Updated 2 years ago
BlinkDL / WorldModel
Let us make Psychohistory (as in Asimov) a reality, and accessible to everyone. Useful for LLM grounding and games / fiction / business /…
☆40 Updated 2 years ago
harrisonvanderbyl / rwkvstic
Framework agnostic python runtime for RWKV models
☆147 Updated 2 years ago
UnstoppableCurry / RWKV-LM-Interpretability-Research
Interpretability analysis of language model outlier and attempts to distill the model
☆13 Updated 2 years ago
ArEnSc / Production-RWKV
This project aims to make RWKV Accessible to everyone using a Hugging Face like interface, while keeping it close to the R and D RWKV bra…
☆65 Updated 2 years ago
mrsteyk / RWKV-LM-deepspeed
☆43 Updated 2 years ago
jiamingkong / rwkv_reward
Training a reward model for RLHF using RWKV.
☆15 Updated 2 years ago
BlinkDL / nanoRWKV
RWKV in nanoGPT style
☆196 Updated last year
codekansas / rwkv
RWKV model implementation
☆38 Updated 2 years ago
wozeparrot / tinyrwkv
tinygrad port of the RWKV large language model.
☆45 Updated 8 months ago
RWKV / rwkv-onnx
A converter and basic tester for rwkv onnx
☆43 Updated last year
tobiaskatsch / GatedLinearRNN
☆29 Updated last year
lachlansneff / sparsellama
☆40 Updated 2 years ago
BlinkDL / fast.c
Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code.
☆73 Updated 9 months ago
catid / bitnet_cpu
Experiments with BitNet inference on CPU
☆54 Updated last year
BlinkDL / SmallInitEmb
LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence
☆61 Updated 3 years ago
lukasVierling / FaceRWKV
Course Project for COMP4471 on RWKV
☆17 Updated last year
SmerkyG / RWKV_Explained
RWKV, in easy to read code
☆72 Updated 8 months ago
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆63 Updated 2 years ago
AXKuhta / rwkv-onnx-dml
Run ONNX RWKV-v4 models with GPU acceleration using DirectML [Windows], or just on CPU [Windows AND Linux]; Limited to 430M model at this…
☆21 Updated 2 years ago
harrisonvanderbyl / rwkv-cpp-accelerated
A torchless, c++ rwkv implementation using 8bit quantization, written in cuda/hip/vulkan for maximum compatibility and minimum dependenci…
☆313 Updated last year
euclaise / supertrainer2000
☆50 Updated last year
OpenMOSE / RWKV-Infer
A large-scale RWKV v7(World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states(Pseudo MoE). Easy to deploy…
☆45 Updated last month
recursal / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆45 Updated last year
josephrocca / rwkv-v4-web
BlinkDL's RWKV-v4 running in the browser
☆47 Updated 2 years ago
schwartz-lab-NLP / TOVA
Token Omission Via Attention
☆127 Updated last year
ethansmith2000 / TransformerExperiments
☆19 Updated 6 months ago
SmerkyG / gptcore
Fast modular code to create and train cutting edge LLMs
☆68 Updated last year
RWKV / RWKV-infctx-trainer
RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond!
☆147 Updated last year
karim-aloulou / Espitchatbot-RASA-RAVEN
Chatbot that answers frequently asked questions in French, English, and Tunisian using the Rasa NLU framework and RWKV-4-Raven
☆13 Updated 2 years ago