ZhaofengWu / counterfactual-evaluation

☆57

Alternatives and similar repositories for counterfactual-evaluation

Users who are interested in counterfactual-evaluation are comparing it to the repositories listed below.

xlang-ai / icl-selective-annotation
[ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners"
☆111 Updated 2 years ago
NoviScl / GPT3-Reliability
☆79 Updated 2 years ago
google-research-datasets / GSM-IC
Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se…
☆64 Updated 2 years ago
YuxiXie / SelfEval-Guided-Decoding
☆103 Updated last year
hkust-nlp / felm
GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)
☆61 Updated last year
Alrope123 / rethinking-demonstrations
☆177 Updated last year
FranxYao / FlanT5-CoT-Specialization
Implementation of the ICML 2023 paper "Specializing Smaller Language Models towards Multi-Step Reasoning".
☆132 Updated 2 years ago
Nanami18 / Snowballed_Hallucination
☆44 Updated last year
OhadRubin / EPR
☆64 Updated 3 years ago
sunlab-osu / Understanding-CoT
☆88 Updated 2 years ago
nkandpa2 / long_tail_knowledge
Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"
☆78 Updated 2 years ago
evandez / REMEDI
Inspecting and Editing Knowledge Representations in Language Models
☆119 Updated 2 years ago
GXimingLu / Quark
☆75 Updated 2 years ago
cicl-stanford / procedural-evals-tom
☆35 Updated 2 years ago
HKUNLP / icl-ceil
[ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”.
☆103 Updated 2 years ago
nayeon7lee / FactualityPrompt
☆87 Updated 3 years ago
liujch1998 / rainier
☆28 Updated last year
GAIR-NLP / alignment-for-honesty
☆76 Updated last year
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54 Updated last year
HKUNLP / ProGen
[EMNLP-2022 Findings] Code for paper “ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback”.
☆27 Updated 2 years ago
wzhouad / context-faithful-llm
Code and data for paper "Context-faithful Prompting for Large Language Models".
☆41 Updated 2 years ago
archiki / ReCEval
Supporting code for ReCEval paper
☆30 Updated last year
lupantech / PromptPG
Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning".
☆163 Updated last year
dannyallover / overthinking_the_truth
☆29 Updated last year
swj0419 / in-context-pretraining
☆54 Updated last year
FreedomIntelligence / OVM
☆68 Updated last year
chaochun / nlu-asdiv-dataset
☆50 Updated 2 years ago
asaparov / prontoqa
Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task.
☆154 Updated 2 months ago
sail-sg / symbolic-instruction-tuning
The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning".
☆66 Updated 2 years ago
eric-mitchell / serac
Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model
☆70 Updated 3 years ago