Code and datasets for the paper "Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse"
☆72 · Mar 3, 2025 · Updated last year
Alternatives and similar repositories for trust-align
Users that are interested in trust-align are comparing it to the libraries listed below.
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique ☆18 · Aug 22, 2024 · Updated last year
- Our EMNLP 2022 paper on MCQA ☆23 · Jan 15, 2023 · Updated 3 years ago
- Our EMNLP 2022 paper on VIP-Based Prompting for Parameter-Efficient Learning ☆10 · Oct 22, 2022 · Updated 3 years ago
- This repository is maintained to release dataset and models for multimodal puzzle reasoning. ☆113 · Feb 26, 2025 · Updated last year
- Test LLMs against jailbreaks and unprecedented harms ☆40 · Oct 19, 2024 · Updated last year
- The implementation of RAGSynth: Synthetic Data for Robust and Faithful RAG Component Optimization ☆21 · May 26, 2025 · Updated 9 months ago
- [NeurIPS 2024] "Self-Calibrated Tuning of Vision-Language Models for Out-of-Distribution Detection" ☆13 · Oct 28, 2024 · Updated last year
- Code repository for the paper on "Predicting the Performance of Black-Box LLMs through Self-Queries". ☆12 · Jan 9, 2025 · Updated last year
- ☆22 · Mar 16, 2023 · Updated 3 years ago
- ☆39 · Apr 15, 2024 · Updated last year
- Restore safety in fine-tuned language models through task arithmetic