☆21Aug 19, 2024Updated last year
Alternatives and similar repositories for HalluDial
Users who are interested in HalluDial are comparing it to the libraries listed below
- [ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO☆63Apr 30, 2025Updated 10 months ago
- ☆49Jan 7, 2024Updated 2 years ago
- ☆16Sep 27, 2023Updated 2 years ago
- Explore, Establish, Exploit: Red Teaming Language Models from Scratch☆13Jun 21, 2023Updated 2 years ago
- ☆14Oct 11, 2023Updated 2 years ago
- Source code for Truth-Aware Context Selection: Mitigating the Hallucinations of Large Language Models Being Misled by Untruthful Contexts☆17Sep 2, 2024Updated last year
- I-SHEEP: Iterative Self-enHancEmEnt Paradigm of LLMs through Self-Instruct and Self-Assessment☆17Jan 16, 2025Updated last year
- Code and data for the FACTOR paper☆53Nov 15, 2023Updated 2 years ago
- ☆17Dec 21, 2023Updated 2 years ago
- [IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection☆90Apr 28, 2024Updated last year
- ☆22Feb 3, 2024Updated 2 years ago
- Code and dataset for the EMNLP 2024 paper: GoldCoin: Grounding Large Language Models in Privacy Laws via Contextual Integrity Theory☆48Sep 26, 2024Updated last year
- Flames is a highly adversarial Chinese-language benchmark for evaluating LLM harmlessness, developed by Shanghai AI Lab and Fudan NLP Group.☆63May 21, 2024Updated last year
- ☆13Aug 26, 2024Updated last year
- Code for "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models"☆12Feb 10, 2025Updated last year
- ☆13Oct 20, 2022Updated 3 years ago
- [NAACL 2025 Main Selected Oral] Repository for the paper: Prompt Compression for Large Language Models: A Survey☆36May 18, 2025Updated 10 months ago
- GAOKAO-Bench-Updates is a supplement to the GAOKAO-Bench, a dataset to evaluate large language models.☆39Jan 7, 2025Updated last year
- Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals☆12May 24, 2024Updated last year
- An awesome, curated list of advanced graph data-centric learning papers (i.e., graph sparsification, graph denoising, graph condensation)☆17Jun 9, 2025Updated 9 months ago
- ☆27Jun 5, 2023Updated 2 years ago
- Setu is a comprehensive pipeline designed to clean, filter, and deduplicate diverse data sources including Web, PDF, and Speech data. Bui…☆16May 17, 2024Updated last year
- ☆15Apr 22, 2024Updated last year
- A Flexible Framework for Comprehensive Multimodal Model Evaluation☆100Feb 2, 2026Updated last month
- Repository for the paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models"☆49Oct 21, 2023Updated 2 years ago
- [ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.☆180Jun 7, 2025Updated 9 months ago
- Implementation for "RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content"☆23Jul 28, 2024Updated last year
- Predictive models and analysis of cancer prognosis and drug response using primary tumor microbial abundances derived from WGS and RNA-se…☆18Nov 13, 2024Updated last year
- ☆15May 12, 2025Updated 10 months ago
- A simple PyTorch implementation of a CLIP-based baseline for image-text matching.☆19May 25, 2023Updated 2 years ago
- ☆51Mar 2, 2024Updated 2 years ago
- ☆22Jan 5, 2024Updated 2 years ago
- LLM evaluation.☆16Nov 7, 2023Updated 2 years ago
- MCP server for German legal texts☆44Dec 19, 2025Updated 3 months ago
- A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)☆174Jun 27, 2025Updated 8 months ago
- ☆14Oct 28, 2023Updated 2 years ago
- Token-level Reference-free Hallucination Detection☆97Jul 25, 2023Updated 2 years ago
- ICLR 2025☆31May 21, 2025Updated 9 months ago
- ☆15Aug 1, 2019Updated 6 years ago