AkariAsai / OpenScholar_ExpertEval
This repository contains the expert evaluation interface and data evaluation script for the OpenScholar project.
☆28 Updated 10 months ago
Alternatives and similar repositories for OpenScholar_ExpertEval
Users who are interested in OpenScholar_ExpertEval are comparing it to the repositories listed below.
- This repository contains ScholarQABench data and evaluation pipeline. ☆85 Updated 2 months ago
- ☆40 Updated 4 months ago
- Analysis code for NeurIPS 2025 paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks" ☆52 Updated 2 months ago
- ☆67 Updated 6 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆96 Updated this week
- Codebase accompanying the Summary of a Haystack paper. ☆79 Updated last year
- Source code for the collaborative reasoner research project at Meta FAIR. ☆102 Updated 5 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated 11 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆98 Updated 10 months ago
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding. ☆42 Updated 6 months ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning ☆69 Updated 3 months ago
- ☆40 Updated 9 months ago
- Leveraging Base Language Models for Few-Shot Synthetic Data Generation ☆35 Updated 2 months ago
- Discovering Data-driven Hypotheses in the Wild ☆113 Updated 4 months ago
- ☆34 Updated 4 months ago
- [ICLR'25] ApolloMoE: Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts ☆49 Updated 10 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 Updated last year
- Jina VDR is a multilingual, multi-domain benchmark for visual document retrieval ☆30 Updated 2 months ago
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery ☆103 Updated last month
- SSRL: Self-Search Reinforcement Learning ☆145 Updated last month
- ☆22 Updated 7 months ago
- Verifiers for LLM Reinforcement Learning ☆74 Updated 5 months ago
- ReBase: Training Task Experts through Retrieval Based Distillation ☆29 Updated 8 months ago
- The first dense retrieval model that can be prompted like an LM ☆89 Updated 5 months ago
- ☆50 Updated last year
- Aioli: A unified optimization framework for language model data mixing ☆27 Updated 8 months ago
- ☆55 Updated 11 months ago
- ☆62 Updated last year
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆130 Updated 11 months ago
- ☆24 Updated 2 months ago