shangshang-wang / Resa
Resa: Transparent Reasoning Models via SAEs
☆41 Updated last month
Alternatives and similar repositories for Resa
Users who are interested in Resa are comparing it to the repositories listed below.
- Official Repo for RuleReasoner. ☆26 Updated 3 months ago
- Lottery Ticket Adaptation ☆39 Updated 9 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆34 Updated 2 weeks ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆22 Updated last month
- CS194-196 Course Project ☆15 Updated 6 months ago
- The official repo for “Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem” [EMNLP25] ☆31 Updated last week
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆36 Updated this week
- ☆16 Updated last year
- ☆26 Updated 2 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆46 Updated 6 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆20 Updated 6 months ago
- 😊 TPTT: Transforming Pretrained Transformers into Titans ☆26 Updated this week
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆30 Updated 4 months ago
- ☆39 Updated 2 months ago
- ☆47 Updated 7 months ago
- Official implementation of ECCV24 paper: POA ☆24 Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆14 Updated 5 months ago
- ☆20 Updated last month
- ☆34 Updated 8 months ago
- Geometric-Mean Policy Optimization ☆74 Updated last month
- Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning" ☆21 Updated 3 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆70 Updated 3 months ago
- ☆26 Updated 2 months ago
- Bayes-Adaptive RL for LLM Reasoning ☆38 Updated 3 months ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆16 Updated 6 months ago
- Code for paper: Long cOntext aliGnment via efficient preference Optimization ☆13 Updated 6 months ago
- Code for "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆18 Updated 5 months ago
- ☆19 Updated 8 months ago
- A holistic benchmark for LLM abstention ☆51 Updated 2 weeks ago
- ☆85 Updated last year