allenai / marg-reviewer
Code/data for MARG (multi-agent review generation)
☆44Updated 8 months ago
Alternatives and similar repositories for marg-reviewer
Users who are interested in marg-reviewer are comparing it to the libraries listed below.
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award …☆42Updated 8 months ago
- ☆72Updated last year
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024)☆37Updated 6 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators☆42Updated last year
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search☆90Updated 7 months ago
- Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought"☆96Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Updated last year
- ☆124Updated 9 months ago
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)☆47Updated 5 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆35Updated last year
- ☆43Updated 11 months ago
- ☆34Updated 8 months ago
- ☆52Updated last year
- ☆22Updated 7 months ago
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location.☆81Updated 11 months ago
- Evaluate the Quality of Critique☆36Updated last year
- Official implementation of the ACL 2024 paper "Scientific Inspiration Machines Optimized for Novelty"☆81Updated last year
- ☆54Updated last year
- PASTA: Post-hoc Attention Steering for LLMs☆121Updated 7 months ago
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`.☆150Updated last year
- Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Mod…☆37Updated last year
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models☆97Updated last year
- This repository contains ScholarQABench data and evaluation pipeline.☆73Updated 3 months ago
- This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".☆30Updated 11 months ago
- ☆44Updated 7 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples☆101Updated last month
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions☆45Updated last year
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"☆55Updated 9 months ago
- [ACL 2024] Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View☆118Updated last month
- Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A…☆45Updated last year