eric11eca / disco
Generating diverse counterfactual data for Natural Language Understanding tasks using Large Language Models (LLMs). The generator supports GPT-3, ChatGPT, and GPT-4.
☆35 Updated last year
Alternatives and similar repositories for disco
Users who are interested in disco are comparing it to the repositories listed below:
- Supporting code for ReCEval paper ☆28 Updated 5 months ago
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆58 Updated 2 years ago
- Evaluate the Quality of Critique ☆35 Updated 9 months ago
- [ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning ☆21 Updated last year
- AbstainQA, ACL 2024 ☆25 Updated 4 months ago
- [EMNLP 2023] Knowledge Rumination for Pre-trained Language Models ☆16 Updated last year
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆39 Updated last year
- [EMNLP-2022 Findings] Code for paper “ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback”. ☆26 Updated 2 years ago
- "TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks" [TMLR 2024] ☆31 Updated 2 months ago
- Personality Alignment of Language Models ☆21 Updated this week
- Code for reproducing the ACL'23 paper: Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments ☆73 Updated last year
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond" ☆59 Updated last month
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆66 Updated 10 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆34 Updated last year
- Data and code for ACL 2022 paper "MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data" ☆42 Updated 4 months ago
- Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context" ☆66 Updated 6 months ago
- Evaluation on Logical Reasoning and Abstract Reasoning Challenges ☆23 Updated last year
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain between cause-effect relationships. It is a QA dataset containing 9000… ☆47 Updated last year
- Benchmarking Generalization to New Tasks from Natural Language Instructions ☆26 Updated 3 years ago
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study ☆43 Updated last year
- The instructions and demonstrations for building a formal logical reasoning capable GLM ☆53 Updated 6 months ago
- Code for our paper: "GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models" ☆53 Updated last year