[NeurIPS 2024] HonestLLM: Toward an Honest and Helpful Large Language Model
☆29 · Jun 10, 2025 · Updated 11 months ago
Alternatives and similar repositories for HonestyLLM
Users that are interested in HonestyLLM are comparing it to the libraries listed below.
- NeurIPS 2025 Poster ☆26 · Feb 4, 2025 · Updated last year
- [ICLR'25] DataGen: Unified Synthetic Dataset Generation via Large Language Models ☆67 · Mar 8, 2025 · Updated last year
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆93 · Feb 17, 2025 · Updated last year
- Can We Trust Large Language Models?: A Benchmark for Responsible Large Language Models via Toxicity, Bias, and Value-alignment Evaluation ☆26 · Oct 12, 2023 · Updated 2 years ago
- (ICLR 2025) The Official Code Repository for GUI-World. ☆69 · Dec 18, 2024 · Updated last year
- [ICML 2024] TrustLLM: Trustworthiness in Large Language Models ☆627 · Jun 24, 2025 · Updated 11 months ago
- [ICLR'26, NAACL'25 Demo] Toolkit & Benchmark for evaluating the trustworthiness of generative foundation models. ☆130 · Aug 22, 2025 · Updated 9 months ago
- [ACL 2025] Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models ☆42 · May 29, 2025 · Updated 11 months ago
- (ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph. ☆31 · Aug 7, 2025 · Updated 9 months ago
- Official PyTorch implementation of Rethinking Guidance Information to Utilize Unlabeled Samples: A Label-Encoding Perspective. ☆19 · Sep 27, 2024 · Updated last year
- [ICLR'24] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use ☆115 · Mar 21, 2024 · Updated 2 years ago
- Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs" ☆10 · Dec 13, 2024 · Updated last year
- Official github repo for AutoDetect, an automated weakness detection framework for LLMs. ☆46 · Jun 25, 2024 · Updated last year
- Graph neural network for predicting energy of known and hypothetical crystal structures ☆10 · Jan 26, 2022 · Updated 4 years ago
- Documentation at ☆14 · Mar 27, 2025 · Updated last year
- Project of ACL 2025 "UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models" ☆14 · Mar 25, 2025 · Updated last year
- AbstainQA, ACL 2024 ☆29 · Feb 4, 2026 · Updated 3 months ago
- Diffusion Probabilistic Model in Jax ☆13 · Apr 20, 2024 · Updated 2 years ago
- Toolkit for foundation models in causal inference ☆33 · Jan 14, 2026 · Updated 4 months ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆72 · Nov 27, 2024 · Updated last year
- This repository contains two datasets with multi-turn adversarial conversations generated by human agents interacting with a dialog model… ☆35 · Jul 16, 2024 · Updated last year
- Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling ☆45 · Apr 19, 2026 · Updated last month
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents ☆38 · Apr 13, 2026 · Updated last month
- ☆12 · Mar 25, 2023 · Updated 3 years ago
- Investigating and Defending Shortcut Learning in Personalized Diffusion Models ☆14 · Nov 19, 2024 · Updated last year
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili… ☆12 · Mar 7, 2025 · Updated last year
- ☆12 · May 13, 2023 · Updated 3 years ago
- Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex Instruction-based Image Generation ☆23 · Sep 24, 2025 · Updated 8 months ago
- [ICML25] CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale ☆25 · Jul 31, 2025 · Updated 9 months ago
- [ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding ☆23 · Feb 26, 2025 · Updated last year
- This repository contains data and analysis scripts to reproduce the figures as well as source code and simulation scripts to perform the … ☆13 · Apr 13, 2021 · Updated 5 years ago
- Python Wrapper of visqol ☆11 · Dec 23, 2024 · Updated last year
- Modified Fetch Robotics environments from OpenAI gym ☆11 · Nov 27, 2021 · Updated 4 years ago
- ☆10 · Jan 19, 2026 · Updated 4 months ago
- [CCS 2024] Optimization-based Prompt Injection Attack to LLM-as-a-Judge ☆40 · Sep 17, 2025 · Updated 8 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆30 · Aug 9, 2025 · Updated 9 months ago
- [AAAI-26] Are We on the Right Way for Assessing Document Retrieval-Augmented Generation? ☆30 · Dec 14, 2025 · Updated 5 months ago
- This repository contains a PyTorch implementation of the ICSE'26 paper "Scrub It Out! Erasing Sensitive Memorization in Code Language Mod… ☆30 · Sep 18, 2025 · Updated 8 months ago
- [ICLR 2024 Oral] Improving Convergence and Generalization Using Parameter Symmetries ☆31 · May 29, 2024 · Updated last year