☆47Mar 20, 2023Updated 3 years ago
Alternatives and similar repositories for benchmark_llm_summarization
Users who are interested in benchmark_llm_summarization are comparing it to the repositories listed below.
- Perform fact checks on your conversations with LLMs to catch fake news, misleading information, and LLM confusion.☆10Apr 22, 2023Updated 3 years ago
- Adaptive-binning for evaluation of confidence calibration☆12Jul 28, 2019Updated 6 years ago
- ☆23Feb 26, 2024Updated 2 years ago
- Implementation of "Can we obtain significant success in RST discourse parsing by using Large Language Models?" (accepted by EACL 2024)☆20May 13, 2024Updated 2 years ago
- Code for paper Towards Mitigating LLM Hallucination via Self Reflection☆30Oct 9, 2023Updated 2 years ago
- Python implementation of the local outlier factor tuning algorithm described in “Automatic Hyperparameter Tuning Method for Local Outlier…☆11Aug 3, 2020Updated 5 years ago
- The official implementation of the paper "DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents"☆31Oct 23, 2025Updated 7 months ago
- A package dedicated to running benchmark agreement testing☆18Sep 18, 2025Updated 8 months ago
- ☆26Nov 21, 2022Updated 3 years ago
- Find informative examples to efficiently (human)-evaluate NLG models.☆17Apr 22, 2026Updated last month
- A simple algorithm to identify and correct for label shift.☆21Feb 4, 2018Updated 8 years ago
- Python package for evaluating model calibration in classification☆20Nov 12, 2019Updated 6 years ago
- Bayesian IRT submodule for GIRTH☆19Nov 11, 2021Updated 4 years ago
- Synthetic data generation for evaluating LLM symbolic and logic reasoning☆22Mar 6, 2026Updated 2 months ago
- Based on the paper "Chain-of-Retrieval Augmented Generation"☆15Mar 29, 2025Updated last year
- An extension of the sigma standard to include security metrics.☆16May 18, 2023Updated 3 years ago
- Code for "Benchmarking the Generation of Fact Checking Explanations"☆10Aug 16, 2024Updated last year
- Label shift experiments☆17Dec 3, 2020Updated 5 years ago
- Summarize CTI reports with OpenAI☆18May 19, 2026Updated last week
- ☆16Aug 15, 2025Updated 9 months ago
- Source code and data of our paper "Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation" (https://arxiv.org/…☆10Jun 21, 2023Updated 2 years ago
- ☆11Jul 14, 2023Updated 2 years ago
- The large-scale MultiLingual SUMmarization corpus☆28May 26, 2022Updated 4 years ago
- ☆14Apr 29, 2025Updated last year
- ☆12Jun 7, 2025Updated 11 months ago
- Preprocessing scripts for ACE and ERE datasets☆15Jul 28, 2020Updated 5 years ago
- Original PyTorch Implementation for the EMNLP 2023 Paper "Beyond Detection: A Defend-and-Summarize Strategy for Robust and Interpretable …☆16Dec 14, 2023Updated 2 years ago
- Repo for "On Learning to Summarize with Large Language Models as References"☆43May 24, 2023Updated 3 years ago
- ☆26Nov 7, 2022Updated 3 years ago
- Code for "ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer"☆16Jul 17, 2024Updated last year
- The official code for the paper "What Constitutes a Faithful Summary? Preserving Author Perspectives in News Summarization"☆10Jun 23, 2024Updated last year
- Welcome to Uncertainty Metrics! The goal of this library is to provide an easy-to-use interface for both measuring uncertainty across Goo…☆24Nov 18, 2020Updated 5 years ago
- Proof system for Fact Verification☆14Jun 7, 2022Updated 3 years ago
- Articles, White Papers, Technical Write-Ups and more authored by members of the GreySec community. Curated by staff, selected for excelle…☆28Aug 17, 2021Updated 4 years ago
- ☆11Sep 24, 2024Updated last year
- ☆21Mar 25, 2023Updated 3 years ago
- Codebase for "A Consistent and Differentiable Lp Canonical Calibration Error Estimator", published at NeurIPS 2022.☆16Mar 18, 2024Updated 2 years ago
- [NeurIPS 2025] RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning☆55Oct 23, 2025Updated 7 months ago
- Dataset and baseline for Coling 2022 long paper (oral): "ConFiguRe: Exploring Discourse-level Chinese Figures of Speech"☆13Jul 27, 2023Updated 2 years ago