lukasVierling / FaceRWKV
Course Project for COMP4471 on RWKV
☆17 · Updated last year
Alternatives and similar repositories for FaceRWKV:
Users interested in FaceRWKV are comparing it to the libraries listed below:
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆28 · Updated this week
- RWKV, in easy-to-read code ☆69 · Updated 3 months ago
- Experiments with BitNet inference on CPU ☆53 · Updated 11 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆41 · Updated 9 months ago
- Training a reward model for RLHF using RWKV. ☆14 · Updated last year
- RWKV-7: Surpassing GPT ☆80 · Updated 3 months ago
- Run ONNX RWKV-v4 models with GPU acceleration using DirectML [Windows], or just on CPU [Windows AND Linux]; limited to 430M model at this… ☆20 · Updated last year
- A converter and basic tester for rwkv onnx ☆42 · Updated last year
- Reinforcement Learning Toolkit for RWKV (v6, v7, ARWKV): distillation, SFT, RLHF (DPO, ORPO), infinite-context training, aligning. Exploring the… ☆33 · Updated last week
- GoldFinch and other hybrid transformer components ☆45 · Updated 7 months ago
- ☆34 · Updated 7 months ago
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆31 · Updated 6 months ago
- tinygrad port of the RWKV large language model. ☆44 · Updated this week
- RWKV v5/v6 LoRA Trainer on CUDA and ROCm platforms. RWKV is an RNN with transformer-level LLM performance. It can be directly trained like … ☆12 · Updated 11 months ago
- A fast RWKV Tokenizer written in Rust ☆42 · Updated 6 months ago
- RWKV in nanoGPT style ☆187 · Updated 9 months ago
- BlinkDL's RWKV-v4 running in the browser ☆47 · Updated 2 years ago
- ☆49 · Updated 11 months ago
- ☆108 · Updated this week
- A large-scale RWKV v6, v7 (World, ARWKV) inference. Capable of inference by combining multiple states (pseudo-MoE). Easy to deploy on docke… ☆31 · Updated 2 weeks ago
- Trying to deconstruct RWKV in understandable terms ☆14 · Updated last year
- ☆13 · Updated last year
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · Updated 9 months ago
- entropix-style sampling + GUI ☆25 · Updated 4 months ago
- Lightweight continuous batching with OpenAI compatibility using HuggingFace Transformers, including T5 and Whisper. ☆21 · Updated this week