bcml-ai / rosa-plus

ROSA+: RWKV's ROSA implementation with fallback statistical predictor

☆25

Alternatives and similar repositories for rosa-plus

Users who are interested in rosa-plus are comparing it to the repositories listed below.

zyaaa-ux / ROSA-Tuning
ROSA-Tuning
☆59 Updated 2 weeks ago
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆101 Updated last year
RWKV / ZeroCoT
https://x.com/BlinkDL_AI/status/1884768989743882276
☆28 Updated 7 months ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆108 Updated 9 months ago
kyleliang919 / Super_Muon
☆66 Updated 9 months ago
OpenMOSE / RWKV-Infer
A large-scale RWKV v7(World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states(Pseudo MoE). Easy to deploy…
☆46 Updated 2 months ago
yynil / RWKVInside
☆39 Updated 7 months ago
Cornell-RelaxML / yaqa-quantization
☆66 Updated 6 months ago
IST-DASLab / gptq-gguf-toolkit
Efficient non-uniform quantization with GPTQ for GGUF
☆57 Updated 3 months ago
bloc97 / DeMo
DeMo: Decoupled Momentum Optimization
☆198 Updated last year
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆111 Updated 8 months ago
CerebrasResearch / reap
REAP: Router-weighted Expert Activation Pruning for SMoE compression
☆151 Updated 2 weeks ago
jukofyork / transplant-vocab
Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining.
☆47 Updated last month
reka-ai / rekaquant
☆62 Updated 5 months ago
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆186 Updated 11 months ago
tinnerhrhe / ROVER
An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards
☆31 Updated 2 months ago
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆104 Updated 7 months ago
main-horse / hnet-old
H-Net Dynamic Hierarchical Architecture
☆80 Updated 3 months ago
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆59 Updated 2 months ago
howard-hou / RWKV-X
RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l…
☆53 Updated 5 months ago
mkurman / grpo-llm-evaluator
Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluati…
☆47 Updated 7 months ago
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41 Updated last year
vicksEmmanuel / latent-gemma
☆26 Updated 11 months ago
HazyResearch / lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
☆249 Updated 10 months ago
kubernetes-bad / reward-composer
Lego for GRPO
☆30 Updated 6 months ago
VatsaDev / NanoPoor
NanoGPT-speedrunning for the poor T4 enjoyers
☆73 Updated 8 months ago
tiiuae / Falcon-H1
All information and news with respect to Falcon-H1 series
☆94 Updated 2 months ago
BorealisAI / neuzip
Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep…
☆60 Updated last year
zitian-gao / URM
☆47 Updated last week
QuixiAI / grokadamw
☆137 Updated last year