OSU-NLP-Group / ScienceAgentBench
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
☆20 · Updated last week
Related projects
Alternatives and complementary repositories for ScienceAgentBench
- ☆40 · Updated last month
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆30 · Updated 9 months ago
- OLAPH: Improving Factuality in Biomedical Long-form Question Answering ☆38 · Updated 2 months ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆35 · Updated last month
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆75 · Updated last month
- ☆41 · Updated last month
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM ☆56 · Updated 5 months ago
- ☆41 · Updated 2 weeks ago
- Open-source Python toolkit focused on deep learning with ordinal methodologies ☆31 · Updated 3 weeks ago
- ☆42 · Updated 2 months ago
- The first dense retrieval model that can be prompted like an LM ☆63 · Updated 2 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆53 · Updated 2 months ago
- The Open Source Code for LLM4SD (Large Language Models for Scientific Synthesis, Inference and Explanation) ☆30 · Updated 3 weeks ago
- A MULTI-GENERATOR ENSEMBLE FRAMEWORK FOR NATURAL LANGUAGE TO SQL ☆46 · Updated last week
- ☆22 · Updated 2 months ago
- ☆37 · Updated 3 weeks ago
- Official implementation of the ACL 2024: Scientific Inspiration Machines Optimized for Novelty ☆68 · Updated 7 months ago
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆20 · Updated last week
- ☆59 · Updated last month
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆61 · Updated 4 months ago
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award … ☆36 · Updated 3 weeks ago
- ☆31 · Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆46 · Updated 2 months ago
- A comprehensive repository of reasoning tasks for Medical LLMs (and beyond) ☆96 · Updated 2 months ago
- Code and data releases for the paper -- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory ☆32 · Updated 3 weeks ago
- ☆56 · Updated 3 weeks ago
- ☆78 · Updated 11 months ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆65 · Updated 4 months ago
- Repository for paper Tools Are Instrumental for Language Agents in Complex Environments ☆32 · Updated last month
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆68 · Updated last month