RWKV / RWKV-infctx-trainer

RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond!

☆147

Alternatives and similar repositories for RWKV-infctx-trainer

Users who are interested in RWKV-infctx-trainer are comparing it to the libraries listed below.

Joluck / RWKV-PEFT
☆148 Updated 2 months ago
Abel2076 / json2binidx_tool
☆81 Updated last year
neromous / RWKV-Ouroboros
This project is established for real-time training of the RWKV model.
☆49 Updated last year
Jellyfish042 / uncheatable_eval
Evaluating LLMs with Dynamic Data
☆96 Updated 3 months ago
OpenMOSE / RWKV-LM-RLHF
Reinforcement Learning Toolkit for RWKV (v6, v7, ARWKV): distillation, SFT, RLHF (DPO, ORPO), infinite context training, aligning. Exploring the…
☆54 Updated last month
BlinkDL / nanoRWKV
RWKV in nanoGPT style
☆193 Updated last year
SmerkyG / gptcore
Fast modular code to create and train cutting edge LLMs
☆68 Updated last year
cryscan / web-rwkv-inspector
☆13 Updated 10 months ago
RWKV / RWKV-wiki
RWKV centralised docs for the community
☆29 Updated 2 months ago
OpenMOSE / RWKV5-LM-LoRA
RWKV v5, v6 LoRA trainer for CUDA and ROCm platforms. RWKV is an RNN with transformer-level LLM performance. It can be directly trained like …
☆13 Updated last year
Gryphe / MergeMonster
An unsupervised model merging algorithm for Transformers-based language models.
☆106 Updated last year
OpenMOSE / RWKV-Infer
Large-scale RWKV v7 (World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states (Pseudo MoE). Easy to deploy…
☆45 Updated last week
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆98 Updated 11 months ago
harrisonvanderbyl / rwkvstic
Framework agnostic python runtime for RWKV models
☆146 Updated 2 years ago
yynil / RWKVinLLAMA
☆17 Updated 9 months ago
Jellyfish042 / RWKV-StateTuning
State tuning tunes the state
☆35 Updated 8 months ago
astramind-ai / BitMat
An efficient implementation of the method proposed in "The Era of 1-bit LLMs"
☆154 Updated last year
BBuf / RWKV-World-HF-Tokenizer
☆34 Updated last year
Blealtan / RWKV-LM-LoRA
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best …
☆412 Updated 2 years ago
SmerkyG / RWKV_Explained
RWKV, in easy to read code
☆72 Updated 7 months ago
yynil / RWKVInside
☆38 Updated 5 months ago
shoumenchougou / Awesome-RWKV-Prompts
Awesome RWKV Prompts: user-friendly, ready-to-use RWKV prompt examples, suitable for all users.
☆36 Updated 9 months ago
keeeeenw / MicroLlama
Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget
☆161 Updated 2 months ago
RWKV / rwkv-onnx
A converter and basic tester for rwkv onnx
☆42 Updated last year
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆202 Updated last year
uukuguy / multi_loras
Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe…
☆158 Updated last year
Digitous / LLM-SLERP-Merge
Spherical merge of PyTorch/HF format language models with minimal feature loss.
☆138 Updated 2 years ago
Jellyfish042 / Sudoku-RWKV
☆147 Updated 11 months ago
Gryphe / BlockMerge_Gradient
Merge Transformers language models by use of gradient parameters.
☆207 Updated last year
cahya-wirawan / rwkv-tokenizer
A fast RWKV Tokenizer written in Rust
☆53 Updated 2 months ago