codefuse-ai / rodimus
☆165 Updated 7 months ago
Alternatives and similar repositories for rodimus
Users who are interested in rodimus are comparing it to the libraries listed below.
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l… ☆52 Updated 5 months ago
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆33 Updated last year
- RADLADS training code ☆35 Updated 7 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆45 Updated 4 months ago
- A repository for research on medium sized language models. ☆77 Updated last year
- ☆39 Updated 7 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 Updated 7 months ago
- ☆70 Updated last year
- GoldFinch and other hybrid transformer components ☆45 Updated last year
- A large-scale RWKV v7(World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states(Pseudo MoE). Easy to deploy… ☆46 Updated 2 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆80 Updated 2 months ago
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling ☆105 Updated 7 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆111 Updated 7 months ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 Updated last year
- RWKV-7: Surpassing GPT ☆101 Updated last year
- An unofficial pytorch implementation of 'Efficient Infinite Context Transformers with Infini-attention' ☆54 Updated last year
- ☆53 Updated last year
- [ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications ☆52 Updated last month
- ☆66 Updated 9 months ago
- Here we will test various linear attention designs. ☆62 Updated last year
- ☆85 Updated last month
- ☆29 Updated last month
- SSRL: Self-Search Reinforcement Learning ☆158 Updated 4 months ago
- GoldFinch and other hybrid transformer components ☆12 Updated last week
- Official implementation of GRAPE: Group Representational Position Encoding (https://arxiv.org/abs/2512.07805) ☆65 Updated last week
- ☆94 Updated last year
- ☆27 Updated 4 months ago
- ☆39 Updated last year
- ☆110 Updated last year
- Official Implementation of APB (ACL 2025 main Oral) ☆32 Updated 9 months ago