hy-zhao23 / Explainability-for-Large-Language-Models
☆108 · Updated 10 months ago
Related projects
Alternatives and complementary repositories for Explainability-for-Large-Language-Models
- A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc. (☆174, updated last month)
- LLM hallucination paper list (☆292, updated 8 months ago)
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"… (☆71, updated last year)
- A Survey on Data Selection for Language Models (☆182, updated last month)
- LLM Unlearning (☆125, updated last year)
- Awesome LLM Self-Consistency: a curated list of self-consistency in large language models (☆76, updated 3 months ago)
- A Survey of Attributions for Large Language Models (☆167, updated 2 months ago)
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" (☆86, updated 2 months ago)
- A curated list of large language models with RAG (☆70, updated last year)
- Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs (☆38, updated 4 months ago)
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji… (☆208, updated last year)
- ☆81, updated last year
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers (☆75, updated last month)
- Project for the paper "Instruction Tuning for Large Language Models: A Survey" (☆146, updated last month)
- EMNLP 2023 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining (☆125, updated 11 months ago)
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) (☆97, updated 7 months ago)
- The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity" (☆327, updated 6 months ago)
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024) (☆61, updated last month)
- BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs) (☆111, updated last year)
- Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models" (☆428, updated 6 months ago)
- The paper list on data contamination for large language model evaluation (☆75, updated this week)
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following (☆117, updated 4 months ago)
- Data and Code for Program of Thoughts (TMLR 2023) (☆243, updated 6 months ago)
- Augmented LLM with self-reflection (☆102, updated last year)
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper "R-Tuning: Instructing Large Language Models to Say 'I Don't… (☆83, updated 4 months ago)
- AI Alignment: A Comprehensive Survey (☆128, updated last year)
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks 🧮✨ (☆103, updated 6 months ago)
- ☆190, updated 3 months ago
- [ACL 2024] SALAD benchmark & MD-Judge (☆106, updated last month)
- Repository for the paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models" (☆46, updated last year)