zjysteven / mink-plus-plus
Min-K%++: Improved baseline for detecting pre-training data of LLMs https://arxiv.org/abs/2404.02936
☆26 · Updated 5 months ago
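Below is a minimal sketch of the Min-K%++ scoring idea described in the paper: each token's log-probability is standardized by the mean and standard deviation of the model's next-token log-probability distribution, and the lowest k% of standardized scores are averaged (higher scores suggest the text was seen during pre-training). The model name, the value of k, and the function name are illustrative assumptions; see the repository for the official implementation.

```python
# Hedged sketch of Min-K%++ scoring (https://arxiv.org/abs/2404.02936).
# Model, tokenizer, and k below are illustrative, not the repo's defaults.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def minkpp_score(text: str, model, tokenizer, k: float = 0.2) -> float:
    """Higher scores suggest the text is more likely pre-training data."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits                              # (1, seq_len, vocab)
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)        # next-token distributions
    token_logp = logprobs.gather(-1, ids[0, 1:, None]).squeeze(-1)
    probs = logprobs.exp()
    mu = (probs * logprobs).sum(-1)                             # E[log p] per position
    sigma = ((probs * logprobs.pow(2)).sum(-1) - mu.pow(2)).clamp_min(1e-6).sqrt()
    scores = (token_logp - mu) / sigma                          # standardized per-token scores
    k_len = max(1, int(len(scores) * k))
    return scores.topk(k_len, largest=False).values.mean().item()

model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
tokenizer = AutoTokenizer.from_pretrained("gpt2")
print(minkpp_score("The quick brown fox jumps over the lazy dog.", model, tokenizer))
```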
Related projects
Alternatives and complementary repositories for mink-plus-plus
- ☆33 · Updated last year
- Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers". ☆58 · Updated 8 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆63 · Updated last year
- ☆49 · Updated last year
- EMNLP 2024: Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue ☆33 · Updated this week
- Official Repository for Dataset Inference for LLMs ☆23 · Updated 3 months ago
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆55 · Updated last year
- Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888 ☆36 · Updated 5 months ago
- Lightweight tool to identify Data Contamination in LLMs evaluation ☆42 · Updated 8 months ago
- Röttger et al. (2023): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆63 · Updated 10 months ago
- Restore safety in fine-tuned language models through task arithmetic ☆26 · Updated 7 months ago
- This repository contains data, code and models for contextual noncompliance. ☆18 · Updated 4 months ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024] ☆49 · Updated last week
- ☆38 · Updated last year
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆57 · Updated 2 weeks ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆75 · Updated last month
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆84 · Updated 5 months ago
- Methods and evaluation for aligning language models temporally ☆24 · Updated 8 months ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? ☆25 · Updated 5 months ago
- Official code for the paper: Evaluating Copyright Takedown Methods for Language Models ☆15 · Updated 4 months ago
- An open-source library for contamination detection in NLP datasets and Large Language Models (LLMs). ☆43 · Updated 3 months ago
- ☆44 · Updated 2 months ago
- ☆39 · Updated last year
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆54 · Updated 10 months ago
- ☆71 · Updated 3 months ago
- Codebase for decoding compressed trust. ☆20 · Updated 6 months ago
- ☆26 · Updated 3 weeks ago
- ☆24 · Updated 6 months ago
- ☆39 · Updated 7 months ago
- [NeurIPS 2023 D&B Track] Code and data for paper "Revisiting Out-of-distribution Robustness in NLP: Benchmarks, Analysis, and LLMs Evalua… ☆29 · Updated last year