Data-Provenance-Initiative / Data-Provenance-Collection
☆199 · Updated this week
Related projects
Alternatives and complementary repositories for Data-Provenance-Collection
- Code accompanying "How I learned to start worrying about prompt formatting". ☆95 · Updated last month
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆122 · Updated 8 months ago
- ☆258 · Updated this week
- A toolkit for describing model features and intervening on those features to steer behavior. ☆99 · Updated last week
- Datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆62 · Updated last year
- ☆451 · Updated 3 weeks ago
- Awesome synthetic (text) datasets ☆242 · Updated 3 weeks ago
- Evaluating LLMs with fewer examples ☆134 · Updated 7 months ago
- MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024] ☆103 · Updated last month
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆190 · Updated this week
- Website for hosting the Open Foundation Models Cheat Sheet. ☆257 · Updated 4 months ago
- Functional Benchmarks and the Reasoning Gap ☆78 · Updated last month
- Visualizing attention for LLM users ☆163 · Updated last year
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆68 · Updated last year
- Benchmarking LLMs with Challenging Tasks from Real Users ☆195 · Updated 2 weeks ago
- Scaling Data-Constrained Language Models ☆321 · Updated last month
- Manage scalable open LLM inference endpoints in Slurm clusters ☆236 · Updated 4 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆109 · Updated last year
- Pretraining Efficiently on S2ORC! ☆136 · Updated 3 weeks ago
- Steering vectors for transformer language models in PyTorch / Hugging Face ☆65 · Updated last month
- ☆112 · Updated last month
- Evaluating LLMs with CommonGen-Lite ☆85 · Updated 8 months ago
- Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction". ☆123 · Updated last month
- The official evaluation suite and dynamic data release for MixEval. ☆224 · Updated last week
- Improving Alignment and Robustness with Circuit Breakers ☆154 · Updated last month
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆213 · Updated last year
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆293 · Updated 11 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆72 · Updated 2 months ago
- ☆101 · Updated 3 months ago
- Datasets collection and preprocessings framework for NLP extreme multitask learning ☆149 · Updated 4 months ago