chongyangtao / LLMs-for-NLG-Evaluation
Awesome LLM for NLG Evaluation Papers
☆25 Jan 23, 2024 Updated 2 years ago
Alternatives and similar repositories for LLMs-for-NLG-Evaluation
Users who are interested in LLMs-for-NLG-Evaluation are comparing it to the repositories listed below
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va… ☆12 Nov 6, 2023 Updated 2 years ago
- ☆12 Mar 22, 2024 Updated last year
- ☆23 Aug 1, 2024 Updated last year
- ☆22 Nov 23, 2023 Updated 2 years ago
- Claude-router is one of the best projects for using open models in claude-code ☆55 Sep 4, 2025 Updated 5 months ago
- ☆61 Sep 18, 2025 Updated 4 months ago
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆76 Jul 18, 2025 Updated 6 months ago
- ☆25 Nov 24, 2023 Updated 2 years ago
- KLUE Benchmark 1st place (2021.12) solutions. (RE, MRC, NLI, STS, TC) ☆25 Apr 11, 2022 Updated 3 years ago
- A comprehensive paper list of Reasoning over Tables. ☆30 Nov 6, 2022 Updated 3 years ago
- ☆35 Nov 17, 2021 Updated 4 years ago
- Authors' implementation of the paper Adaptive Information Seeking for Open-Domain Question Answering, published in EMNLP 2021. ☆38 May 16, 2023 Updated 2 years ago
- Do Multilingual Language Models Think Better in English? ☆42 Aug 3, 2023 Updated 2 years ago
- Test code for the Inverse Cloze Task for information retrieval ☆33 Jan 10, 2021 Updated 5 years ago
- Yet another Python binding for mecab-ko ☆88 May 16, 2023 Updated 2 years ago
- A review of class imbalance problems using data augmentation and ensemble learning ☆10 Mar 15, 2023 Updated 2 years ago
- Ansible for building a Kaggle environment ☆13 Jul 30, 2019 Updated 6 years ago
- Embodied-Planner-R1: Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning ☆24 Jan 5, 2026 Updated last month
- ☆10 May 19, 2024 Updated last year
- Evaluation Pipeline for medical tasks. ☆12 Updated this week
- ☆92 Mar 3, 2022 Updated 3 years ago
- ☆13 Sep 26, 2024 Updated last year
- ☆12 Nov 9, 2018 Updated 7 years ago
- react-ts-antd-template ☆10 Mar 27, 2020 Updated 5 years ago
- SIGIR 2021: Proactive Retrieval-based Chatbots based on Relevant Knowledge and Goals ☆11 Jul 30, 2021 Updated 4 years ago
- ☆11 Apr 10, 2023 Updated 2 years ago
- Reasoning-based Evaluation and Ranking of Translations. ☆19 Jul 18, 2025 Updated 6 months ago
- ☆10 Dec 18, 2023 Updated 2 years ago
- English and Chinese LaTeX template for reports/projects/proposal at Beijing Institute of Technology ☆10 Nov 19, 2020 Updated 5 years ago
- 𝔸𝕄𝔹ℝ𝕆𝕊𝕀𝔸: A Benchmark for Parsing Ambiguous Questions into Database Queries ☆14 Oct 31, 2024 Updated last year
- ☆11 Nov 13, 2020 Updated 5 years ago
- Helps create image datasets for machine learning. ☆10 Nov 4, 2020 Updated 5 years ago
- Image segmentation using Gaussian Markov Random Fields, and probability maximization using ICM ☆11 Nov 6, 2015 Updated 10 years ago
- EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization ☆13 Mar 20, 2025 Updated 10 months ago
- The contrastive token loss function for reducing generative repetition of autoregressive neural language models. ☆13 May 11, 2022 Updated 3 years ago
- ☆10 Oct 6, 2021 Updated 4 years ago
- The official code of TACL 2022, "Break, Perturb, Build: Automatic Perturbation of Reasoning Paths Through Question Decomposition". ☆11 Oct 18, 2021 Updated 4 years ago
- Inductive reasoning benchmark with subregular hierarchy for string-to-string transformation ☆14 Jun 27, 2025 Updated 7 months ago
- Reproduction Code for Paper "Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models" ☆13 Jun 1, 2024 Updated last year