TorchRWKV / rwkv-kit
☆14 · Updated last month
Related projects
Alternatives and complementary repositories for rwkv-kit
- ☆84 · Updated this week
- VisualRWKV is the visually enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks. ☆183 · Updated 2 weeks ago
- ☆33 · Updated 4 months ago
- An inference framework for the RWKV large language model, implemented purely in native PyTorch. The official native implementation… ☆118 · Updated 4 months ago
- RWKV, in easy-to-read code. ☆55 · Updated this week
- Evaluating LLMs with Dynamic Data. ☆72 · Updated 2 weeks ago
- Centralised RWKV docs for the community. ☆19 · Updated 2 months ago
- Direct Preference Optimization for RWKV, targeting RWKV-5 and RWKV-6. ☆11 · Updated 8 months ago
- A project for real-time training of the RWKV model. ☆50 · Updated 6 months ago
- Unofficial implementation of Evolutionary Model Merging. ☆33 · Updated 7 months ago
- ☆12 · Updated 3 months ago
- RWKV in nanoGPT style. ☆178 · Updated 5 months ago
- ☆14 · Updated this week
- RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond! ☆133 · Updated 3 months ago
- Awesome RWKV Prompts: user-friendly, ready-to-use prompt examples for all users. ☆30 · Updated 3 months ago
- ☆17 · Updated 6 months ago
- Copies the MLP of Llama 3 eight times as 8 experts, creates a randomly initialized router, and adds a load-balancing loss to construct an 8x8b Mo… ☆25 · Updated 4 months ago
- A fast RWKV tokenizer written in Rust. ☆36 · Updated 2 months ago
- Official implementation of Phi-Mamba, a MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆79 · Updated 2 months ago
- Implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models". ☆74 · Updated this week
- Unofficial implementations of block/layer-wise pruning methods for LLMs. ☆51 · Updated 6 months ago
- RWKV fine-tuning. ☆36 · Updated 7 months ago
- ☆81 · Updated 6 months ago
- A repository for research on medium-sized language models. ☆74 · Updated 6 months ago
- A single repo with all scripts and utils to train or fine-tune the Mamba model, with or without FIM. ☆50 · Updated 7 months ago
- [NeurIPS 2024] Official repository of "The Mamba in the Llama: Distilling and Accelerating Hybrid Models". ☆175 · Updated this week
- Continuous batching and parallel acceleration for RWKV6. ☆23 · Updated 4 months ago
- [EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner. ☆112 · Updated last week
- RAG system for RWKV. ☆36 · Updated last week
- Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton. ☆13 · Updated 2 weeks ago
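
One entry above builds an 8x8b MoE by duplicating the Llama 3 MLP as eight experts, attaching a randomly initialized router, and adding a load-balancing loss. A minimal plain-Python sketch of that auxiliary loss (the Switch-Transformer-style formulation; the function name and shapes here are illustrative, not that repo's API):

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def load_balancing_loss(router_logits, num_experts):
    """Auxiliary load-balancing loss: num_experts * sum_i(f_i * P_i),
    where f_i is the fraction of tokens whose top-1 expert is i and
    P_i is the mean router probability for expert i.
    Balanced routing drives the value toward 1.0."""
    n_tokens = len(router_logits)
    f = [0.0] * num_experts  # top-1 routing fractions
    p = [0.0] * num_experts  # mean router probabilities
    for logits in router_logits:
        probs = softmax(logits)
        top = max(range(num_experts), key=probs.__getitem__)
        f[top] += 1.0 / n_tokens
        for i in range(num_experts):
            p[i] += probs[i] / n_tokens
    return num_experts * sum(fi * pi for fi, pi in zip(f, p))

# Skewed routing (every token prefers expert 0) vs. balanced routing
# (tokens spread evenly across the 8 experts).
skewed = [[5.0] + [0.0] * 7 for _ in range(16)]
balanced = [[5.0 if i == t % 8 else 0.0 for i in range(8)] for t in range(16)]
print(load_balancing_loss(skewed, 8))    # well above 1.0
print(load_balancing_loss(balanced, 8))  # close to 1.0
```

Adding this term to the language-modeling loss penalizes routers that collapse onto a few experts, which is why the upcycling recipe above needs it when the router starts from random initialization.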