google / curie
Code release for "CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning", ICLR 2025
☆23Updated 2 months ago
Alternatives and similar repositories for curie
Users who are interested in curie are comparing it to the repositories listed below.
- [preprint] PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration☆15Updated 2 weeks ago
- implementation of dualformer☆17Updated 3 months ago
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award …☆42Updated 8 months ago
- [ICLR 2025] <MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses>☆43Updated last week
- Structured Chemistry Reasoning with Large Language Models☆39Updated last year
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers"☆23Updated 2 months ago
- [ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning☆57Updated 3 months ago
- A testbed for agents and environments that can automatically improve models through data generation.☆24Updated 3 months ago
- Official Implementation of the Baby-AIGS system☆23Updated 7 months ago
- Official Implementation of UA^{2}-Agent and other baseline algorithms of "Towards Unified Alignment Between Agents, Humans, and Environme…☆17Updated 7 months ago
- MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension☆45Updated 6 months ago
- Official repository for the paper Number Cookbook: Number Understanding of Language Models and How to Improve It.☆16Updated 2 months ago
- Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024]☆59Updated 5 months ago
- A collection of resources and papers on AI Scientist / Robot Scientist☆73Updated 3 weeks ago
- Pre-trained Language Model for Scientific Text☆45Updated last year
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆25Updated 3 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers☆19Updated 3 months ago
- SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models☆19Updated 7 months ago
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025)☆25Updated last month
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation☆40Updated 8 months ago
- Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges☆14Updated last month
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024)☆34Updated 11 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆34Updated last year
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery☆89Updated 3 weeks ago
- PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging"☆37Updated last year
- Official implementation of "BERTs are Generative In-Context Learners"☆28Updated 3 months ago
- Make reasoning models scalable☆37Updated 3 weeks ago
- Official implementation for "Law of the Weakest Link: Cross capabilities of Large Language Models"☆42Updated 8 months ago