Language modeling with linear-cost context
☆118Sep 25, 2025Updated 9 months ago
Alternatives and similar repositories for retention
Users who are interested in retention are comparing it to the libraries listed below.
- Marketplace ML experiment - training without backprop☆28Sep 9, 2025Updated 9 months ago
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l…☆58Mar 31, 2026Updated 3 months ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards☆36Oct 3, 2025Updated 8 months ago
- ROSA-Tuning☆74Feb 4, 2026Updated 4 months ago
- Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs"☆35Oct 12, 2025Updated 8 months ago
- ROSA+: RWKV's ROSA implementation with fallback statistical predictor☆36Oct 13, 2025Updated 8 months ago
- ☆41Apr 30, 2025Updated last year
- Direct Preference Optimization for RWKV, aiming for RWKV-5 and 6.☆11Mar 1, 2024Updated 2 years ago
- ☆12Dec 21, 2024Updated last year
- Official Chinese documentation for RWKV | RWKV官方中文文档☆14Jun 10, 2026Updated 2 weeks ago
- ☆69Mar 21, 2025Updated last year
- RWKV v5, v6 LoRA trainer on CUDA and ROCm platforms. RWKV is an RNN with transformer-level LLM performance. It can be directly trained like …☆13Mar 24, 2024Updated 2 years ago
- This repository contains the entire pipeline (including data preprocessing, training, testing, evaluation and visualization) for the Shear…☆11Dec 3, 2019Updated 6 years ago
- ☆30Feb 27, 2024Updated 2 years ago
- Implementation of BitNet-1.58 instruct tuning☆31Apr 14, 2024Updated 2 years ago
- Research work aimed at addressing the problem of modeling infinite-length context☆49Dec 18, 2025Updated 6 months ago
- An open-source UNet-based pipeline for nuclei segmentation in histopathology images using the PanNuke dataset. It features an interactive…☆11Jan 9, 2025Updated last year
- ☆19Sep 29, 2024Updated last year
- Compression performance of BPG, JPEG, JPEG2000 and WebP.☆12May 15, 2019Updated 7 years ago
- A PyTorch implementation of the shearlet transform.☆17Oct 9, 2025Updated 8 months ago
- 2D Gaussian splatting for image compression☆19Nov 29, 2024Updated last year
- Reinforcing General Reasoning without Verifiers☆101Jun 24, 2025Updated last year
- ☆14Aug 9, 2023Updated 2 years ago
- Decompose skin into its independent components: hemoglobin and melanin☆10Jan 6, 2015Updated 11 years ago
- Modeling code for a BitNet b1.58 Llama-style model.☆25Apr 30, 2024Updated 2 years ago
- Here we will test various linear attention designs.☆62Apr 25, 2024Updated 2 years ago
- ☆17Jan 1, 2025Updated last year
- Reactively track user's online, offline, and idle statuses☆10Jun 3, 2022Updated 4 years ago
- Visualize the low-level outputs of YOLOv8 to analyze and understand the areas where our model focuses. Specifically, illustrate which anc…☆15Feb 5, 2024Updated 2 years ago
- Attention Kernels for Symmetric Power Transformers☆130Sep 25, 2025Updated 9 months ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"☆364Jun 18, 2026Updated last week
- Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes☆11Jul 10, 2024Updated last year
- ☆11Jun 14, 2019Updated 7 years ago
- [MICCAI 2024] Official code for the paper "MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation"☆14Nov 1, 2024Updated last year
- This repo is an exploratory experiment to enable frozen pretrained RWKV language models to accept speech modality input. We followed the …☆54Dec 23, 2024Updated last year
- Python script for encrypting and decrypting in the same way as an Enigma machine☆17Aug 15, 2011Updated 14 years ago
- A repository aimed at pruning DeepSeek V3, R1 and R1-zero to a usable size☆87Sep 5, 2025Updated 9 months ago
- ☆10Jul 13, 2024Updated last year
- 🔍 Code Search Tools & Experiments☆12Jun 4, 2026Updated 3 weeks ago