chrisvdweth / selene
An open, large-scale, interactive textbook.
☆50 Updated last week
Alternatives and similar repositories for selene
Users who are interested in selene are comparing it to the repositories listed below.
- Papers on fairness in NLP ☆450 Updated last year
- ☆156 Updated 2 years ago
- ☆57 Updated last year
- A resource repository for representation engineering in large language models ☆141 Updated last year
- ☆237 Updated last year
- A reading list of up-to-date papers on NLP for Social Good. ☆304 Updated 2 years ago
- [NeurIPS D&B '25] The one-stop repository for large language model (LLM) unlearning. Supports TOFU, MUSE, WMDP, and many unlearning metho… ☆430 Updated last month
- A repo for open resources & information for people to succeed in PhD in CS & career in AI / NLP ☆956 Updated last year
- ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models. ☆151 Updated 3 months ago
- ☆165 Updated last year
- ☆57 Updated 2 years ago
- ☆26 Updated last year
- Materials for EACL2024 tutorial: Transformer-specific Interpretability ☆60 Updated last year
- ☆24 Updated this week
- Official repository for our NeurIPS 2023 paper "Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense… ☆181 Updated 2 years ago
- A resource repository for machine unlearning in large language models ☆509 Updated 4 months ago
- A survey and reflection on the latest research breakthroughs in LLM-generated Text detection, including data, detectors, metrics, current… ☆236 Updated 11 months ago
- This repository contains the data and code introduced in the paper "CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Maske… ☆127 Updated last year
- ☆61 Updated 4 months ago
- Training data extraction on GPT-2 ☆193 Updated 2 years ago
- ☆20 Updated 3 months ago
- UnQovering Stereotyping Biases via Underspecified Questions - EMNLP 2020 (Findings) ☆21 Updated 4 years ago
- Links to conference/journal publications in automated fact-checking (resources for the TACL22/EMNLP23 paper). ☆538 Updated 9 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆76 Updated last year
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆84 Updated 8 months ago
- This repository collects all relevant resources about interpretability in LLMs ☆384 Updated last year
- [ICLR 2025] General-purpose activation steering library ☆120 Updated 2 months ago
- Influence Analysis and Estimation - Survey, Papers, and Taxonomy ☆83 Updated last year
- ☆28 Updated last year
- Python package for measuring memorization in LLMs. ☆173 Updated 4 months ago