google / curie
Code release for "CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning", ICLR 2025
☆28 Updated 6 months ago
Alternatives and similar repositories for curie
Users that are interested in curie are comparing it to the libraries listed below
- Implementation of dualformer ☆24 Updated 8 months ago
- [ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning https://arxiv.org/abs/2501.06590 ☆72 Updated 3 months ago
- [ICLR 2025] Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning" ☆78 Updated 8 months ago
- ☆281 Updated 3 weeks ago
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning ☆107 Updated last week
- ☆33 Updated 10 months ago
- SSRL: Self-Search Reinforcement Learning ☆149 Updated 2 months ago
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning ☆161 Updated last month
- AIRA-dojo: a framework for developing and evaluating AI research agents ☆106 Updated last month
- Esoteric Language Models ☆104 Updated last month
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆103 Updated last week
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning ☆13 Updated 4 months ago
- Defeating the Training-Inference Mismatch via FP16 ☆56 Updated last week
- Official Code for Paper "Think While You Generate: Discrete Diffusion with Planned Denoising" [ICLR 2025] ☆82 Updated 6 months ago
- A testbed for agents and environments that can automatically improve models through data generation. ☆27 Updated 8 months ago
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study ☆55 Updated 11 months ago
- Demystifying Reinforcement Learning in Agentic Reasoning ☆111 Updated 3 weeks ago
- ☆105 Updated this week
- A collection of resources and papers on AI Scientist / Robot Scientist ☆104 Updated last month
- ☆35 Updated 5 months ago
- [ICML 2025] Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction ☆72 Updated 5 months ago
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025) ☆32 Updated 3 months ago
- Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆53 Updated 3 weeks ago
- [ICML 2025 Oral] LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models ☆81 Updated 3 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆84 Updated last year
- Official Implementation of the Baby-AIGS system ☆23 Updated 11 months ago
- ☆77 Updated last week
- The official GitHub repo for "Diffusion Language Models are Super Data Learners". ☆145 Updated this week
- ☆221 Updated 8 months ago
- A benchmark that challenges language models to code solutions for scientific problems ☆153 Updated this week