☆30Nov 14, 2025Updated 4 months ago
Alternatives and similar repositories for tower-eval
Users that are interested in tower-eval are comparing it to the libraries listed below
- (NAACL 2024) Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations☆15Apr 14, 2025Updated 11 months ago
- ☆13Aug 23, 2024Updated last year
- ☆11Sep 25, 2025Updated 5 months ago
- Minimum Bayes Risk Decoding for Hugging Face Transformers☆60Jun 3, 2024Updated last year
- Find informative examples to efficiently (human)-evaluate NLG models.☆18Feb 27, 2026Updated 3 weeks ago
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.☆127Oct 13, 2025Updated 5 months ago
- Code and data for the paper "Disentangling Uncertainty in Machine Translation Evaluation", accepted at EMNLP 2022.☆23Jun 23, 2023Updated 2 years ago
- Repository for "Uncertainty-Aware Machine Translation Evaluation", accepted to Findings of EMNLP 2021.☆34Sep 22, 2021Updated 4 years ago
- ☆15Apr 12, 2021Updated 4 years ago
- ∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation☆19Feb 14, 2025Updated last year
- Sampling-Based Minimum Bayes-Risk Decoding for Neural Machine Translation☆16Oct 14, 2022Updated 3 years ago
- Repository for "BLEU Meets COMET: Combining Lexical and Neural Metrics Towards Robust Machine Translation Evaluation", accepted at EAMT 2…☆20Jul 19, 2023Updated 2 years ago
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT☆22May 24, 2023Updated 2 years ago
- A library for minimum Bayes risk (MBR) decoding☆52Nov 2, 2025Updated 4 months ago
- Official PyTorch (Lightning) implementation of the NeurIPS 2020 paper "Efficient Marginalization of Discrete and Structured Latent Variab…☆27May 3, 2021Updated 4 years ago
- GEMBA — GPT Estimation Metric Based Assessment☆146Dec 15, 2025Updated 3 months ago
- Neural discourse structure for text categorization☆12Aug 27, 2017Updated 8 years ago
- PACIFIC: Towards Proactive Conversational Question Answering over Tabular and Textual Data in Finance☆14May 15, 2024Updated last year
- This repository provides open-source code for sparse continuous distributions and corresponding Fenchel-Young losses.☆15May 10, 2023Updated 2 years ago
- ☆36Mar 26, 2022Updated 3 years ago
- ☆134Jan 22, 2026Updated last month
- SocialDial: A Benchmark for Socially-Aware Dialogue Systems (SIGIR'23)☆16Aug 4, 2023Updated 2 years ago
- Using data from IBM Watson, descriptive and predictive analytics using Python and tableau☆12Dec 23, 2017Updated 8 years ago
- Repo for our AKBC-2021 paper: Abg-CoQA: Clarifying Ambiguity in Conversational Question Answering☆10Oct 10, 2021Updated 4 years ago
- Code for paper "Neural Semi-Markov Conditional Random Fields for Robust Character-Based Part-of-Speech Tagging"☆16May 31, 2019Updated 6 years ago
- ☆10May 31, 2021Updated 4 years ago
- Library for pruning experts per language pair in NLLB-200☆34Jul 7, 2023Updated 2 years ago
- ☆13Jul 13, 2018Updated 7 years ago
- State-of-the-art LLM-based translation models.☆582Apr 9, 2025Updated 11 months ago
- Author implementation of the paper "Don’t paraphrase, detect! Rapid and Effective Data Collection for Semantic Parsing"☆20Oct 5, 2020Updated 5 years ago
- Code for "End-to-End Learning of Flowchart Grounded Task-Oriented Dialogs"☆14Oct 10, 2022Updated 3 years ago
- LLM Agent that performs sentiment analysis of drawings and natural language using a combination of Google Gemini Vision model and GPT-4 T…☆13Dec 22, 2023Updated 2 years ago
- Torchreid-Pip: Packaged version of Torchreid☆14Oct 16, 2022Updated 3 years ago
- 🧠 Workshop Notebook and assets for the Anthropic Hackathon☆12Nov 4, 2023Updated 2 years ago
- ☆19Mar 12, 2025Updated last year
- [EMNLP 2024] RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning☆15May 13, 2025Updated 10 months ago
- A large Chinese sentiment lexicon consisting of 8000 words☆24Oct 31, 2012Updated 13 years ago
- BERT score for text generation☆12Jan 15, 2025Updated last year
- 🤖 Code for our EMNLP 2022 paper: "BotsTalk: Machine-sourced Framework for Automatic Curation of Large-scale Multi-skill Dialogue Dataset…☆16Oct 7, 2024Updated last year