Data-Provenance-Initiative / Data-Provenance-CollectionLinks

☆256

Alternatives and similar repositories for Data-Provenance-Collection

Users that are interested in Data-Provenance-Collection are comparing it to the libraries listed below

Sorting:

stanford-crfm / ecosystem-graphs
☆268Updated 9 months ago
allenai / wimbd
What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets
☆223Updated last year
davanstrien / awesome-synthetic-datasets
awesome synthetic (text) datasets
☆305Updated this week
allenai / fm-cheatsheet
Website for hosting the Open Foundation Models Cheat Sheet.
☆268Updated 6 months ago
huggingface / data-is-better-together
Let's build better datasets, together!
☆264Updated 10 months ago
felipemaiapolo / tinyBenchmarks
Evaluating LLMs with fewer examples
☆168Updated last year
msclar / formatspread
Code accompanying "How I learned to start worrying about prompt formatting".
☆110Updated 5 months ago
rosewang2008 / bridge
NAACL 2024. Code & Dataset for "🌁 Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistake…
☆44Updated last year
meg-tong / sycophancy-eval
datasets from the paper "Towards Understanding Sycophancy in Language Models"
☆96Updated 2 years ago
ConsequentAI / fneval
Functional Benchmarks and the Reasoning Gap
☆89Updated last year
sileod / tasksource
Datasets collection and preprocessings framework for NLP extreme multitask learning
☆188Updated 4 months ago
cohere-ai / magikarp
Code for the paper "Fishing for Magikarp"
☆172Updated 6 months ago
huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆276Updated last year
chaitanyamalaviya / ExpertQA
[Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers
☆135Updated last year
TransluceAI / observatory
A toolkit for describing model features and intervening on those features to steer behavior.
☆214Updated last year
lukasberglund / reversal_curse
☆297Updated 2 years ago
JinjieNi / MixEval
The official evaluation suite and dynamic data release for MixEval.
☆252Updated last year
EleutherAI / concept-erasure
Erasing concepts from neural representations with provable guarantees
☆238Updated 9 months ago
IBM / unitxt
🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …
☆211Updated last week
kaistAI / FLASK
[ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
☆217Updated last year
minalee-research / coauthor-interface
☆100Updated last year
PAIR-code / interpretability
PAIR.withgoogle.com and friend's work on interpretability methods
☆210Updated last month
EleutherAI / elk
Keeping language models honest by directly eliciting knowledge encoded in their activations.
☆212Updated last week
vinid / NegotiationArena
☆79Updated last year
allenai / peS2o
Pretraining Efficiently on S2ORC!
☆172Updated last year
stanford-crfm / fmti
The Foundation Model Transparency Index
☆83Updated last year
tatsu-lab / opinions_qa
☆116Updated last year
princeton-nlp / LitSearch
[EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search
☆101Updated 11 months ago
allenai / catwalk
This project studies the performance and robustness of language models and task-adaptation methods.
☆154Updated last year
huggingface / datablations
Scaling Data-Constrained Language Models
☆341Updated 4 months ago