epoch-research / data-stock

Models for data stocks and training dataset sizes

☆18

Alternatives and similar repositories for data-stock

Users who are interested in data-stock are comparing it to the libraries listed below

epoch-research / training-cost-trends
☆16Updated 2 weeks ago
benpry / why-think-step-by-step
Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience"
☆61Updated 3 months ago
yale-nlp / SciArena
Analysis code for paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks"
☆44Updated 2 weeks ago
centerforaisafety / emergent-values
Code for "Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs"
☆52Updated 4 months ago
allenai / infinigram-api
☆70Updated this week
ExtensityAI / benchmark
Evaluation of neuro-symbolic engines
☆38Updated 11 months ago
facebookresearch / matrix
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…
☆73Updated this week
stanford-crfm / fmti
The Foundation Model Transparency Index
☆82Updated last year
alonsosilvaallende / knowledge-graph-generator
Knowledge Graph Generator app
☆31Updated last year
facebookresearch / ExploreToM
Code for ExploreToM
☆84Updated 3 weeks ago
google-deepmind / dangerous-capability-evaluations
☆55Updated 9 months ago
METR / eval-analysis-public
Public repository containing METR's DVC pipeline for eval data analysis
☆78Updated 3 months ago
microsoft / eureka-ml-insights
A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.
☆164Updated this week
goodfire-ai / r1-interpretability
Open source interpretability artefacts for R1.
☆154Updated 3 months ago
graphcore / Gradient-HuggingFace
Tasks and tutorials using Graphcore's IPU with Hugging Face. Originally at https://github.com/gradient-ai/Graphcore-HuggingFace
☆16Updated last year
ThrunGroup / maptree
☆39Updated last year
benediktstroebl / agent-evals
☆22Updated last month
keyonvafa / world-model-evaluation
☆59Updated 8 months ago
facebookresearch / BigOBench
BigOBench assesses the capacity of Large Language Models (LLMs) to comprehend time-space computational complexity of input or generated c…
☆35Updated 3 months ago
google-deepmind / mishax
☆134Updated 3 months ago
poking-agents / modular-public
☆23Updated last month
graphcore / distributed-kge-poplar
The application is an end-user training and evaluation system for standard knowledge graph embedding models. It was developed to optimise …
☆18Updated last month
ConsequentAI / fneval
Functional Benchmarks and the Reasoning Gap
☆88Updated 9 months ago
METR / RE-Bench
☆92Updated 2 months ago
bethgelab / CiteME
CiteME is a benchmark designed to test the abilities of language models in finding papers that are cited in scientific texts.
☆48Updated 8 months ago
evanatyourservice / llm-jax
Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆17Updated 4 months ago
facebookresearch / collaborative-reasoner
Source code for the collaborative reasoner research project at Meta FAIR.
☆95Updated 3 months ago
microsoft / stop
Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation
☆44Updated last year
likenneth / q_probe
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
☆41Updated last year
guidance-ai / jsonschemabench
☆47Updated last month