OSU-NLP-Group / ScienceAgentBench
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
☆19 · Updated last week
Related projects
Alternatives and complementary repositories for ScienceAgentBench
- ☆39 · Updated 3 weeks ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆30 · Updated 9 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆75 · Updated 3 weeks ago
- ☆41 · Updated 2 months ago
- ☆40 · Updated this week
- The first dense retrieval model that can be prompted like an LM ☆62 · Updated last month
- ☆31 · Updated 2 weeks ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆46 · Updated 2 months ago
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM ☆55 · Updated 5 months ago
- Open-source Python toolkit focused on deep learning with ordinal methodologies ☆31 · Updated last week
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆38 · Updated last month
- Code and data releases for the paper -- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory ☆30 · Updated 2 weeks ago
- ☆57 · Updated last month
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆34 · Updated 3 weeks ago
- ☆30 · Updated last month
- ☆25 · Updated 2 months ago
- ☆50 · Updated 2 weeks ago
- Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning ☆32 · Updated 3 weeks ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆61 · Updated 4 months ago
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning" ☆45 · Updated last month
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆102 · Updated 6 months ago
- ☆49 · Updated 3 weeks ago
- ☆21 · Updated last month
- OLAPH: Improving Factuality in Biomedical Long-form Question Answering ☆38 · Updated 2 months ago
- Discovering Data-driven Hypotheses in the Wild ☆39 · Updated 2 weeks ago
- Using multiple LLMs for ensemble Forecasting ☆16 · Updated 9 months ago
- ☆41 · Updated last month
- Official implementation for <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>, accepted by ACL 2024. It a… ☆35 · Updated 2 weeks ago
- ☆36 · Updated last month
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆51 · Updated last month