subconscious-systems / TIMRUN
☆39Updated 3 weeks ago
Alternatives and similar repositories for TIMRUN
Users that are interested in TIMRUN are comparing it to the libraries listed below
- ☆19Updated 5 months ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆31Updated last week
- ☆66Updated 4 months ago
- The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"☆89Updated 3 weeks ago
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…☆56Updated this week
- ☆51Updated 2 months ago
- Resa: Transparent Reasoning Models via SAEs☆41Updated this week
- Verifiers for LLM Reinforcement Learning☆69Updated 4 months ago
- Lottery Ticket Adaptation☆39Updated 8 months ago
- Esoteric Language Models☆92Updated 3 weeks ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models.☆92Updated 3 months ago
- ☆25Updated last month
- Efficient Agent Training for Computer Use☆122Updated 2 months ago
- List of papers on Self-Correction of LLMs.☆74Updated 7 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆100Updated 2 months ago
- Official Repository for Task-Circuit Quantization☆22Updated 2 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆60Updated 11 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling☆33Updated 2 weeks ago
- [ICML 24 NGSM workshop] Associative Recurrent Memory Transformer implementation and scripts for training and evaluation☆43Updated this week
- The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search☆53Updated last month
- A repository for research on medium-sized language models.☆78Updated last year
- ☆90Updated 3 months ago
- DPO, but faster 🚀☆44Updated 8 months ago
- Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning☆35Updated last month
- ☆66Updated last month
- Train, tune, and infer Bamba model☆131Updated 2 months ago
- ☆48Updated 11 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆82Updated this week
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience"☆61Updated 4 months ago
- ☆40Updated 3 months ago