OAfzal / nlp-for-peer-review
☆36 Updated 4 months ago
Alternatives and similar repositories for nlp-for-peer-review:
Users who are interested in nlp-for-peer-review are comparing it to the repositories listed below.
- Code/data for MARG (multi-agent review generation) ☆41 Updated 4 months ago
- ☆104 Updated 10 months ago
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award … ☆39 Updated 5 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆36 Updated 3 months ago
- Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Mod… ☆35 Updated last year
- Evaluate the Quality of Critique ☆34 Updated 9 months ago
- The source code for running LLMs on the AAAR-1.0 benchmark. ☆16 Updated this week
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Updated last year
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain between cause-effect relationships. It is a QA dataset containing 9000… ☆47 Updated last year
- AbstainQA, ACL 2024 ☆25 Updated 5 months ago
- ☆40 Updated 11 months ago
- ☆41 Updated last year
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/ ☆21 Updated 3 weeks ago
- ☆23 Updated 2 months ago
- Code repository for the paper "Mission: Impossible Language Models." ☆50 Updated last week
- ☆23 Updated last year
- This repository contains data, code and models for contextual noncompliance. ☆20 Updated 8 months ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆41 Updated last year
- ☆22 Updated last year
- Evaluating the Moral Beliefs Encoded in LLMs ☆24 Updated 3 months ago
- ☆78 Updated 2 years ago
- Supporting code for ReCEval paper ☆28 Updated 6 months ago
- ☆155 Updated 4 months ago
- Tasks for describing differences between text distributions. ☆16 Updated 7 months ago
- ☆47 Updated last year
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding. ☆37 Updated 2 weeks ago
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆58 Updated 2 years ago
- Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration" ☆34 Updated 7 months ago
- ☆34 Updated 3 years ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆52 Updated 4 months ago