suzgunmirac / dynamic-cheatsheet
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory
☆56 Updated 3 weeks ago
Alternatives and similar repositories for dynamic-cheatsheet:
Users who are interested in dynamic-cheatsheet are comparing it to the repositories listed below.
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆55 Updated 8 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆85 Updated last month
- ☆72 Updated 6 months ago
- ☆114 Updated 2 months ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆88 Updated last month
- ☆60 Updated last year
- Replicating O1 inference-time scaling laws ☆84 Updated 5 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ☆135 Updated 5 months ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning ☆53 Updated last month
- Implementation of the Quiet-STAR paper (https://arxiv.org/pdf/2403.09629.pdf) ☆53 Updated 9 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- Official repository for paper "ReasonIR Training Retrievers for Reasoning Tasks". ☆112 Updated last week
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" ☆141 Updated 2 weeks ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆139 Updated 6 months ago
- Exploration of automated dataset selection approaches at large scales. ☆39 Updated 2 months ago
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models ☆51 Updated 2 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆84 Updated 5 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆36 Updated 4 months ago
- ☆46 Updated 2 months ago
- ☆120 Updated 7 months ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆33 Updated 6 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆77 Updated 7 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆90 Updated 2 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆80 Updated last month
- Accompanying material for sleep-time compute paper ☆77 Updated last week
- Implementation of "SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models" ☆27 Updated 2 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains ☆77 Updated this week
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Updated last year
- ☆43 Updated 8 months ago
- ☆62 Updated last month