☆35Nov 17, 2021Updated 4 years ago
Alternatives and similar repositories for lm-calibration
Users who are interested in lm-calibration are comparing it to the libraries listed below
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va…☆12Nov 6, 2023Updated 2 years ago
- Open-WikiTable: Dataset for Open Domain Question Answering with Complex Reasoning over Table☆27Jun 2, 2023Updated 2 years ago
- Code for EMNLP 2022 Paper: On the Calibration of Massively Multilingual Language Models☆15Jun 12, 2023Updated 2 years ago
- Code and datasets for the EMNLP 2020 paper "Calibration of Pre-trained Transformers"☆60Jun 12, 2023Updated 2 years ago
- Code for "End-to-End Learning of Flowchart Grounded Task-Oriented Dialogs"☆14Oct 10, 2022Updated 3 years ago
- ☆15Nov 17, 2020Updated 5 years ago
- ☆22Jan 5, 2024Updated 2 years ago
- ☆17Dec 21, 2023Updated 2 years ago
- Source codes for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023).☆17Jan 8, 2025Updated last year
- Exploring limitations of LLM-as-a-judge☆20Aug 17, 2024Updated last year
- A heterogeneous entity-augmented academic language model based on Open Academic Graph (OAG)☆83Oct 31, 2024Updated last year
- EMNLP'2022: BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation☆41Oct 19, 2022Updated 3 years ago
- ☆20Jun 7, 2020Updated 5 years ago
- The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables"☆23Dec 21, 2023Updated 2 years ago
- ☆28Feb 11, 2026Updated 3 weeks ago
- ☆22Feb 26, 2024Updated 2 years ago
- Code for "DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation" (ACL 2024)☆22May 18, 2024Updated last year
- Official Github repo for the paper "Evaluating the Evaluation of Diversity in Natural Language Generation"☆20Feb 23, 2021Updated 5 years ago
- ☆22Aug 10, 2022Updated 3 years ago
- [NeurIPS 2022] Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings☆22Jan 30, 2023Updated 3 years ago
- Momentum Decoding: Open-ended Text Generation as Graph Exploration☆19Jan 27, 2023Updated 3 years ago
- AASC: ACL Anthology Sentence Corpus☆20Oct 28, 2020Updated 5 years ago
- ☆24Jun 12, 2023Updated 2 years ago
- The official code repository for MetricMT - a reward optimization method for NMT with learned metrics☆25Apr 24, 2021Updated 4 years ago
- Awesome LLM for NLG Evaluation Papers☆25Jan 23, 2024Updated 2 years ago
- ☆30May 20, 2022Updated 3 years ago
- ☆38Jul 24, 2025Updated 7 months ago
- ☆25Oct 22, 2022Updated 3 years ago
- EMNLP'22 | PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning☆32Jun 8, 2023Updated 2 years ago
- A2T: Towards Improving Adversarial Training of NLP Models (EMNLP 2021 Findings)☆27Sep 12, 2021Updated 4 years ago
- Code and resources for evaluating cross-lingual embedding spaces☆29Apr 7, 2020Updated 5 years ago
- KETOD Knowledge-Enriched Task-Oriented Dialogue☆32Jan 4, 2023Updated 3 years ago
- DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence☆36Jul 25, 2023Updated 2 years ago
- NILE: Natural Language Inference with Faithful Natural Language Explanations☆30Jun 12, 2023Updated 2 years ago
- Official code of the paper Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Le…☆75Mar 20, 2024Updated last year
- Code for "Variational Template Machine for Data-to-text generation"☆31Jul 17, 2020Updated 5 years ago
- The Stanford Word Substitution (Swords) Benchmark☆33Mar 24, 2022Updated 3 years ago
- ☆14Aug 14, 2024Updated last year
- Official Github repo for the paper "Unifying Human and Statistical Evaluation for Natural Language Generation"☆74Mar 26, 2019Updated 6 years ago