chchenhui / mlrbench
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research
☆20 · Updated 3 months ago
Alternatives and similar repositories for mlrbench
Users who are interested in mlrbench compare it to the repositories listed below.
- Extending context length of visual language models ☆12 · Updated last year
- ☆21 · Updated 8 months ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning. ☆24 · Updated 3 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆70 · Updated 5 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆51 · Updated 5 months ago
- [ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents ☆25 · Updated 2 months ago
- ☆19 · Updated 8 months ago
- Instruction-following benchmark for large reasoning models ☆44 · Updated 5 months ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training ☆50 · Updated 3 weeks ago