AkariAsai / OpenScholar_ExpertEval
This repository contains the expert evaluation interface and data evaluation scripts for the OpenScholar project.
☆23Updated 3 months ago
Alternatives and similar repositories for OpenScholar_ExpertEval:
Users who are interested in OpenScholar_ExpertEval are comparing it to the repositories listed below:
- This repository contains the ScholarQABench data and evaluation pipeline.☆61Updated 2 weeks ago
- ReBase: Training Task Experts through Retrieval Based Distillation☆28Updated 2 weeks ago
- [SIGIR 2024 (Demo)] CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models☆22Updated last year
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data☆35Updated this week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆54Updated 5 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search☆69Updated 2 months ago
- The first dense retrieval model that can be prompted like an LM☆64Updated 5 months ago
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"☆44Updated 4 months ago
- Codebase accompanying the Summary of a Haystack paper.☆74Updated 5 months ago
- Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Ref…☆43Updated last week
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts?☆43Updated this week
- Scalable Meta-Evaluation of LLMs as Evaluators☆43Updated last year
- Code and Data for "Language Modeling with Editable External Knowledge"☆31Updated 8 months ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning☆59Updated 6 months ago
- A testbed for agents and environments that can automatically improve models through data generation.☆18Updated 2 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆76Updated 4 months ago
- An unofficial implementation of the SOLAR-10.7B model and the newly proposed interlocked-DUS (iDUS), with implementation and experiment details.☆12Updated 11 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆48Updated 2 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆31Updated last year
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales☆72Updated 2 weeks ago