The official repository for our EMNLP 2024 paper, Themis: A Reference-free NLG Evaluation Language Model with Flexibility and Interpretability.
☆20 · Feb 23, 2025 · Updated last year
Alternatives and similar repositories for Themis
Users who are interested in Themis compare it to the repositories listed below.
- HANNA, a large annotated dataset of Human-ANnotated NArratives for ASG evaluation. ☆35 · Oct 15, 2024 · Updated last year
- [AAAI 2024] History Matters: Temporal Knowledge Editing in Large Language Model ☆14 · Dec 17, 2023 · Updated 2 years ago
- ☆25 · May 16, 2024 · Updated last year
- FBI: Finding Blindspots in LLM Evaluations with Interpretable Checklists ☆31 · Aug 14, 2025 · Updated 6 months ago
- Evaluate the Quality of Critique ☆36 · Jun 1, 2024 · Updated last year
- ☆12 · Sep 21, 2023 · Updated 2 years ago
- Official implementation of the paper "On the Importance of Environments in Human-Robot Coordination", published in RSS 2021. ☆16 · May 1, 2024 · Updated last year
- 🎹🎵🎶 A platform to make Original and Cover Visible and Valuable. ☆13 · Nov 8, 2022 · Updated 3 years ago
- ☆16 · Jun 25, 2025 · Updated 8 months ago
- ☆47 · Mar 25, 2025 · Updated 11 months ago
- ☆11 · Oct 8, 2023 · Updated 2 years ago
- A pipeline for phylogenetic diversity analysis of GBIF-mediated data ☆13 · May 30, 2025 · Updated 9 months ago
- Benchmarks for Business Document Foundation Models ☆10 · Apr 4, 2024 · Updated last year
- SIGIR 2021: Proactive Retrieval-based Chatbots based on Relevant Knowledge and Goals ☆11 · Jul 30, 2021 · Updated 4 years ago
- Tools for Natural Language Processing ☆12 · Feb 16, 2018 · Updated 8 years ago
- ☆12 · Nov 1, 2023 · Updated 2 years ago
- search-rattailcollagen1 created by GitHub Classroom ☆10 · Jan 17, 2021 · Updated 5 years ago
- ☆10 · May 27, 2024 · Updated last year
- Repository of paper "Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis" (ACL 2025 Main) ☆19 · Jul 19, 2025 · Updated 7 months ago
- A Scrapy-based crawler for China University MOOC ☆10 · Jul 29, 2022 · Updated 3 years ago
- ACL24 ☆11 · Jun 7, 2024 · Updated last year
- Tools to estimate the correlation of different text-based evaluation measures for Automatic Image Description ☆10 · Feb 2, 2017 · Updated 9 years ago
- V2 of CodeGraphy. VSCode force-based graph extension for displaying file connections ☆13 · Jun 10, 2023 · Updated 2 years ago
- AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation (EMNLP 2024 Findings) ☆15 · Dec 30, 2024 · Updated last year
- To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and … ☆11 · Jun 18, 2024 · Updated last year
- Repo for MGraph project ☆13 · Jan 10, 2026 · Updated last month
- The repo for using the model https://huggingface.co/thu-coai/Attacker-v0.1 ☆13 · Apr 23, 2025 · Updated 10 months ago
- ☆13 · Jun 5, 2024 · Updated last year
- ☆14 · Aug 30, 2023 · Updated 2 years ago
- Distinguishing between anime and hentai ☆16 · Jan 29, 2017 · Updated 9 years ago
- ☆10 · Apr 16, 2019 · Updated 6 years ago
- ☆14 · Nov 23, 2023 · Updated 2 years ago
- [ICML 2024] "Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection" ☆14 · Feb 15, 2025 · Updated last year
- Code for the Explained Series for the Very Normal Youtube channel ☆19 · Jul 19, 2024 · Updated last year
- ☆14 · May 19, 2025 · Updated 9 months ago
- ☆11 · Nov 21, 2024 · Updated last year
- Library for simulating time progression in Python ☆16 · Aug 16, 2025 · Updated 6 months ago
- Natural Language to Code ☆14 · May 2, 2021 · Updated 4 years ago
- Faster, more accurate and entirely open source method for predicting contacts in proteins ☆12 · May 21, 2018 · Updated 7 years ago