☆32 · Jul 11, 2024 · Updated last year
Alternatives and similar repositories for AutoBencher
Users that are interested in AutoBencher are comparing it to the libraries listed below.
- Code for "FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge". EMNLP 2023. ☆20 · Dec 25, 2023 · Updated 2 years ago
- Implementation of the paper "FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations (NAACL 2022)" ☆51 · Jul 26, 2023 · Updated 2 years ago
- ☆13 · May 17, 2025 · Updated last year
- ☆23 · Mar 8, 2024 · Updated 2 years ago
- The implementation of "Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation" in PyTorch. ☆17 · Nov 11, 2021 · Updated 4 years ago
- Measuring if attention is explanation with ROAR ☆22 · Mar 3, 2023 · Updated 3 years ago
- [ICLR'26, NAACL'25 Demo] Toolkit & Benchmark for evaluating the trustworthiness of generative foundation models. ☆130 · Aug 22, 2025 · Updated 9 months ago
- ☆33 · Nov 16, 2025 · Updated 6 months ago
- ☆14 · Oct 7, 2024 · Updated last year
- ScienceMeter: Tracking Scientific Knowledge Updates in Language Models ☆17 · Jun 28, 2025 · Updated 10 months ago
- Simple phoenix setup for padded window management ☆13 · Apr 25, 2018 · Updated 8 years ago
- Mental state inference from observable behavior ☆15 · Dec 3, 2021 · Updated 4 years ago
- Code used to run experiments for the ICLR 2023 paper "Computational Language Acquisition with Theory of Mind". ☆15 · Apr 27, 2023 · Updated 3 years ago
- The example of correspondence between fine classes and superclasses (coarse classes) in ImageNet. ☆13 · Dec 4, 2024 · Updated last year
- ☆41 · Jan 26, 2025 · Updated last year
- Code listing for the paper 'SATAR: A Self-supervised Approach to Twitter Account Representation Learning and its Application in Bot Detec… ☆10 · Nov 1, 2021 · Updated 4 years ago
- Code and instructions accompanying ICCV'23 paper Prototype-based Dataset Comparison ☆18 · Dec 15, 2023 · Updated 2 years ago
- ☆16 · Dec 14, 2023 · Updated 2 years ago
- Check your grade automatically and send e-mail when new grade comes ☆12 · Feb 7, 2018 · Updated 8 years ago
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths ☆19 · Jul 10, 2025 · Updated 10 months ago
- Code for Columbia University COMS 3997 – LLM Ethics and Foundations ☆15 · Jan 7, 2025 · Updated last year
- Code for the paper "Symmetric Machine Theory of Mind", presented at ICML 2022. ☆12 · Jul 18, 2022 · Updated 3 years ago
- This repo contains the ToMnet+ model for preference inference. Developed by Yun-Shiuan, Edwinn, Hsin-Yi, and Elaine. ☆10 · Feb 24, 2023 · Updated 3 years ago
- Resolving Knowledge Conflicts in Large Language Models, COLM 2024 ☆18 · Oct 7, 2025 · Updated 7 months ago
- CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation ☆14 · Aug 19, 2025 · Updated 9 months ago
- ☆13 · Jul 25, 2023 · Updated 2 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Aug 25, 2023 · Updated 2 years ago
- ☆21 · Feb 10, 2025 · Updated last year
- ☆11 · Jan 3, 2024 · Updated 2 years ago
- Find context neurons in Pythia models. ☆13 · Jun 13, 2023 · Updated 2 years ago
- ☆12 · May 13, 2023 · Updated 3 years ago
- [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models ☆40 · Jul 19, 2024 · Updated last year
- [ICML25] CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale ☆25 · Jul 31, 2025 · Updated 9 months ago
- Repository for the ACL 2024 conference website ☆18 · Feb 3, 2025 · Updated last year
- This repository contains the joint use of CPO and SimPO method for better reference-free preference learning methods. ☆57 · Aug 13, 2024 · Updated last year
- Framework for unified summarisation and evaluation of English documents using state-of-the-art models and measures. ☆33 · May 13, 2024 · Updated 2 years ago
- ☆11 · Dec 22, 2021 · Updated 4 years ago
- Kubernetes Tutorial for the PS2 group meetings at UC Berkeley ☆16 · Mar 23, 2023 · Updated 3 years ago
- ☆23 · Jan 25, 2023 · Updated 3 years ago