google / curie
Code release for "CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning", ICLR 2025
☆28Updated 5 months ago
Alternatives and similar repositories for curie
Users who are interested in curie are comparing it to the libraries listed below.
- AIRA-dojo: a framework for developing and evaluating AI research agents☆101Updated 3 weeks ago
- implementation of dualformer☆21Updated 7 months ago
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning☆13Updated 3 months ago
- A collection of resources and papers on AI Scientist / Robot Scientist☆101Updated 2 weeks ago
- Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024]☆64Updated 9 months ago
- [ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning https://arxiv.org/abs/2501.06590 ☆72Updated 2 months ago
- SSRL: Self-Search Reinforcement Learning☆147Updated 2 months ago
- Esoteric Language Models☆101Updated last week
- Official repository for the paper Number Cookbook: Number Understanding of Language Models and How to Improve It.☆17Updated 6 months ago
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod…☆101Updated last month
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning☆104Updated this week
- ☆76Updated last month
- [ICML 2025] Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction☆72Updated 4 months ago
- Official Implementation of UA^{2}-Agent and other baseline algorithms of "Towards Unified Alignment Between Agents, Humans, and Environme…☆19Updated 11 months ago
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award …☆42Updated 11 months ago
- ☆30Updated 5 months ago
- Official Implementation of the Baby-AIGS system☆23Updated 10 months ago
- Reinforcing General Reasoning without Verifiers☆90Updated 3 months ago
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning☆156Updated last month
- Structured Chemistry Reasoning with Large Language Models☆38Updated last year
- ☆35Updated 5 months ago
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery☆104Updated last month
- Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508)☆51Updated last week
- ☆51Updated 7 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆71Updated 4 months ago
- ☆33Updated 9 months ago
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025)☆30Updated 3 months ago
- Exploration of automated dataset selection approaches at large scales.☆47Updated 7 months ago
- ☆218Updated 7 months ago
- ☆40Updated 4 months ago