★32 · Jul 11, 2024 · Updated last year
Alternatives and similar repositories for AutoBencher
Users that are interested in AutoBencher are comparing it to the libraries listed below
- Universal, customizable and deployable fine-grained evaluation for text generation. ★24 · Oct 26, 2023 · Updated 2 years ago
- The implementation of <Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation> in PyTorch. ★17 · Nov 11, 2021 · Updated 4 years ago
- Exploring limitations of LLM-as-a-judge ★20 · Aug 17, 2024 · Updated last year
- Measuring if attention is explanation with ROAR ★22 · Mar 3, 2023 · Updated 3 years ago
- ★23 · Mar 8, 2024 · Updated last year
- ★28 · Nov 16, 2025 · Updated 3 months ago
- Framework for unified summarisation and evaluation of English documents using state-of-the-art models and measures. ★33 · May 13, 2024 · Updated last year
- Pascal2 Harvest project QuEst ★14 · Sep 15, 2014 · Updated 11 years ago
- DEPRECATED - A DogStatsd Python client ★16 · Dec 12, 2018 · Updated 7 years ago
- Implementing BERT + CRF with PyTorch for Chinese NER. ★10 · Mar 7, 2022 · Updated 3 years ago
- ★15 · Jun 28, 2023 · Updated 2 years ago
- ★12 · Aug 6, 2024 · Updated last year
- Base Docker image for deploying Kinesis Client Applications in Python ★10 · Nov 10, 2015 · Updated 10 years ago
- Repo for "On Learning to Summarize with Large Language Models as References" ★43 · May 24, 2023 · Updated 2 years ago
- Code listing for the paper 'SATAR: A Self-supervised Approach to Twitter Account Representation Learning and its Application in Bot Detec… ★10 · Nov 1, 2021 · Updated 4 years ago
- Repository of paper "Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis" (ACL 2025 Main) ★19 · Jul 19, 2025 · Updated 7 months ago
- ★11 · Jan 3, 2024 · Updated 2 years ago
- Implementation of Self Normalizing Networks (SNN) in PyTorch. ★12 · Jun 19, 2017 · Updated 8 years ago
- ★11 · Oct 11, 2023 · Updated 2 years ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ★13 · Jun 22, 2025 · Updated 8 months ago
- ★16 · Dec 14, 2023 · Updated 2 years ago
- ★10 · Jun 28, 2020 · Updated 5 years ago
- Chessort is a Chess puzzle game where you sort moves based on the chess engine's evaluation. ★14 · Oct 30, 2024 · Updated last year
- Code for "Using Embeddings to Correct for Unobserved Confounding" ★10 · May 31, 2019 · Updated 6 years ago
- Official code for AAAI'20 paper "Merging Weak and Active Supervision for Semantic Parsing" ★11 · Dec 8, 2022 · Updated 3 years ago
- To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and … ★11 · Jun 18, 2024 · Updated last year
- CliniDeID automatically de-identifies clinical text notes according to the HIPAA Safe Harbor method. It accurately finds identifiers and … ★10 · Aug 13, 2023 · Updated 2 years ago
- Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms ★14 · Jul 15, 2022 · Updated 3 years ago
- ★13 · Aug 14, 2022 · Updated 3 years ago
- ★11 · Dec 22, 2021 · Updated 4 years ago
- The contrastive token loss function for reducing generative repetition of autoregressive neural language models. ★13 · May 11, 2022 · Updated 3 years ago
- Codes for "Benchmarking the Generation of Fact Checking Explanations" ★10 · Aug 16, 2024 · Updated last year
- playing with gpt4 ★14 · Mar 17, 2023 · Updated 2 years ago
- EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization ★13 · Mar 20, 2025 · Updated 11 months ago
- C++ implementation of HPYLM ★11 · May 2, 2017 · Updated 8 years ago
- Ranger helps you see the forest among the trees - Ranger is an effect-size meta analysis library creating beautiful forest plots! ★11 · Jun 12, 2023 · Updated 2 years ago
- [ACL2023] Source code for Dialogue Summarization with Static-Dynamic Structure Fusion Graph ★11 · Dec 17, 2023 · Updated 2 years ago
- Modified Beam Search with periodical restart ★12 · Sep 12, 2024 · Updated last year
- A common protocol for AI agent tools ★10 · Oct 21, 2024 · Updated last year