bcml-labs / rosa-plus
ROSA+: RWKV's ROSA implementation with fallback statistical predictor
☆31 Updated 3 months ago
Alternatives and similar repositories for rosa-plus
Users who are interested in rosa-plus are comparing it to the repositories listed below.
- ROSA-Tuning ☆65 Updated last week
- RWKV-7: Surpassing GPT ☆104 Updated last year
- ☆67 Updated 10 months ago
- ☆71 Updated 7 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 Updated 9 months ago
- A large-scale RWKV v7(World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states(Pseudo MoE). Easy to deploy… ☆47 Updated 3 months ago
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l… ☆54 Updated 3 weeks ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆35 Updated 11 months ago
- Work in progress. ☆79 Updated 2 months ago
- ☆29 Updated 3 months ago
- Universal Reasoning Model ☆122 Updated 3 weeks ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆115 Updated 9 months ago
- [ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications ☆52 Updated 3 months ago
- H-Net Dynamic Hierarchical Architecture ☆81 Updated 5 months ago
- ☆41 Updated 9 months ago
- Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax" ☆86 Updated 4 months ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards ☆36 Updated 4 months ago
- ☆91 Updated last year
- Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding ☆203 Updated 3 weeks ago
- ☆26 Updated last year
- A repository for research on medium sized language models. ☆77 Updated last year
- Official repo of paper LM2 ☆46 Updated 11 months ago
- All information and news with respect to Falcon-H1 series ☆108 Updated 4 months ago
- ☆54 Updated last year
- [ICLR 2026] GRAPE: Group Representational Position Encoding (https://arxiv.org/abs/2512.07805) ☆78 Updated 2 weeks ago
- ☆119 Updated last month
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆110 Updated 11 months ago
- The evaluation framework for training-free sparse attention in LLMs ☆117 Updated 2 weeks ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆61 Updated last year
- PyTorch implementation of models from the Zamba2 series. ☆186 Updated last year