amazon-science / llm-hallucinations-factual-qa
☆12 · Updated 6 months ago
Alternatives and similar repositories for llm-hallucinations-factual-qa
Users interested in llm-hallucinations-factual-qa are comparing it to the repositories listed below
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models ☆59 · Updated last year
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts ☆34 · Updated last year
- CoNLI: a plug-and-play framework for ungrounded hallucination detection and reduction ☆31 · Updated last year
- NeurIPS'24 - LLM Safety Landscape ☆26 · Updated 5 months ago
- Implementation and datasets for "Training Language Models to Generate Quality Code with Program Analysis Feedback" ☆25 · Updated 3 weeks ago
- Training and Benchmarking LLMs for Code Preference. ☆34 · Updated 9 months ago
- ☆31 · Updated 2 years ago
- TrustAgent: Towards Safe and Trustworthy LLM-based Agents ☆50 · Updated 6 months ago
- For our ACL25 Paper: Can Language Models Replace Programmers? RepoCod Says ‘Not Yet’ - by Shanchao Liang and Yiran Hu and Nan Jiang and L… ☆22 · Updated this week
- A novel approach to improve the safety of large language models, enabling them to transition effectively from an unsafe to a safe state. ☆63 · Updated 2 months ago
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use ☆157 · Updated last year
- ☆44 · Updated 6 months ago
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming" ☆44 · Updated 10 months ago
- awesome-LLM-controlled-constrained-generation ☆49 · Updated 11 months ago
- [ACL'2025 Findings] Official repo for "HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Task… ☆28 · Updated 4 months ago
- Code for watermarking language models ☆80 · Updated 11 months ago
- Implementation of the paper "Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing" ☆19 · Updated last year
- Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks ☆29 · Updated last year
- The official repository of the paper "On the Exploitability of Instruction Tuning". ☆64 · Updated last year
- A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks ☆12 · Updated 5 months ago
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆64 · Updated 3 weeks ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆52 · Updated 6 months ago
- [TACL] Code for "Red Teaming Language Model Detectors with Language Models" ☆23 · Updated last year
- Code and results for the paper "On the Resilience of Multi-Agent Systems with Malicious Agents" ☆25 · Updated 6 months ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆96 · Updated last year
- Code repo for the ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs" ☆126 · Updated last year
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆78 · Updated 7 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆29 · Updated 5 months ago
- A curated list of explainability-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to… ☆37 · Updated last month
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆91 · Updated 8 months ago