☆32 · Jul 11, 2024 · Updated last year
Alternatives and similar repositories for AutoBencher
Users that are interested in AutoBencher are comparing it to the libraries listed below.
- Universal, customizable and deployable fine-grained evaluation for text generation. ☆24 · Apr 22, 2026 · Updated 2 weeks ago
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025) ☆27 · Feb 25, 2025 · Updated last year
- [ICLR'25] DataGen: Unified Synthetic Dataset Generation via Large Language Models ☆67 · Mar 8, 2025 · Updated last year
- Exploring limitations of LLM-as-a-judge ☆20 · Aug 17, 2024 · Updated last year
- ☆23 · Mar 8, 2024 · Updated 2 years ago
- The implementation of <Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation> in PyTorch. ☆17 · Nov 11, 2021 · Updated 4 years ago
- Measuring if attention is explanation with ROAR ☆22 · Mar 3, 2023 · Updated 3 years ago
- The official code for the paper "What Constitutes a Faithful Summary? Preserving Author Perspectives in News Summarization" ☆10 · Jun 23, 2024 · Updated last year
- ☆32 · Nov 16, 2025 · Updated 5 months ago
- ☆12 · Jan 20, 2025 · Updated last year
- ScienceMeter: Tracking Scientific Knowledge Updates in Language Models ☆17 · Jun 28, 2025 · Updated 10 months ago
- A tool for calling (and calling out to) large language models. ☆16 · Aug 13, 2024 · Updated last year
- Simple phoenix setup for padded window management ☆13 · Apr 25, 2018 · Updated 8 years ago
- Mental state inference from observable behavior ☆15 · Dec 3, 2021 · Updated 4 years ago
- Code used to run experiments for the ICLR 2023 paper "Computational Language Acquisition with Theory of Mind". ☆15 · Apr 27, 2023 · Updated 3 years ago
- The example of correspondence between fine classes and superclasses (coarse classes) in ImageNet. ☆13 · Dec 4, 2024 · Updated last year
- Code listing for the paper 'SATAR: A Self-supervised Approach to Twitter Account Representation Learning and its Application in Bot Detec… ☆10 · Nov 1, 2021 · Updated 4 years ago
- Code and instructions accompanying ICCV'23 paper Prototype-based Dataset Comparison ☆18 · Dec 15, 2023 · Updated 2 years ago
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths ☆18 · Jul 10, 2025 · Updated 9 months ago
- This repo contains the ToMnet+ model for preference inference. Developed by Yun-Shiuan, Edwinn, Hsin-Yi, and Elaine. ☆10 · Feb 24, 2023 · Updated 3 years ago
- Resolving Knowledge Conflicts in Large Language Models, COLM 2024 ☆18 · Oct 7, 2025 · Updated 6 months ago
- ☆13 · Jul 25, 2023 · Updated 2 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Aug 25, 2023 · Updated 2 years ago
- ☆21 · Feb 10, 2025 · Updated last year
- This repository includes the code implementation of the paper Improving Pacing in Long-Form Story Planning by Yichen Wang, Kevin Yang, Xi… ☆17 · Nov 19, 2024 · Updated last year
- [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models ☆40 · Jul 19, 2024 · Updated last year
- [ICML25] CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale ☆25 · Jul 31, 2025 · Updated 9 months ago
- This repository contains the joint use of CPO and SimPO method for better reference-free preference learning methods. ☆57 · Aug 13, 2024 · Updated last year
- ☆19 · Feb 3, 2022 · Updated 4 years ago
- ☆14 · Apr 16, 2024 · Updated 2 years ago
- ☆11 · Dec 22, 2021 · Updated 4 years ago
- Code accompanying ICML 2021 paper "Few-shot Language Coordination by Modeling Theory of Mind" ☆18 · May 18, 2022 · Updated 3 years ago
- [NeurIPS 2023] PyTorch code for Can Language Models Teach? Teacher Explanations Improve Student Performance via Theory of Mind ☆66 · Dec 21, 2023 · Updated 2 years ago
- [AAAI-26] Are We on the Right Way for Assessing Document Retrieval-Augmented Generation? ☆29 · Dec 14, 2025 · Updated 4 months ago
- EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization ☆13 · Mar 20, 2025 · Updated last year
- ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind (AAAI2025) ☆20 · Apr 16, 2025 · Updated last year
- BIC: Twitter Bot Detection with Text-Graph Interaction and Semantic Consistency ☆16 · Nov 11, 2024 · Updated last year
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va… ☆12 · Nov 6, 2023 · Updated 2 years ago
- ☆25 · May 16, 2024 · Updated last year