epoch-research / data-stock
Models for data stocks and training dataset sizes
☆18Updated last year
Alternatives and similar repositories for data-stock
Users who are interested in data-stock are comparing it to the libraries listed below
- Public repository containing METR's DVC pipeline for eval data analysis☆138Updated 7 months ago
- Fluid Language Model Benchmarking☆22Updated 2 months ago
- ☆20Updated last week
- Open Source Replication of Anthropic's Alignment Faking Paper☆51Updated 7 months ago
- ☆104Updated 3 months ago
- Learning to route instances for Human vs AI Feedback (ACL Main '25)☆26Updated 4 months ago
- Code for "Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs"☆83Updated 9 months ago
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience"☆62Updated 7 months ago
- Analysis code for NeurIPS 2025 paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks"☆55Updated 3 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆106Updated this week
- 🤝 The code for "Can Large Language Model Agents Simulate Human Trust Behaviors?"☆102Updated 7 months ago
- OLMost every training recipe you need to perform data interventions with the OLMo family of models.☆56Updated this week
- UQ: Assessing Language Models on Unsolved Questions☆28Updated 3 months ago
- ☆53Updated last year
- Official repository of the 2025 paper, LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra.☆52Updated 4 months ago
- ☆87Updated this week
- ☆41Updated last year
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery☆109Updated 3 months ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models.☆94Updated 6 months ago
- ☆25Updated 6 months ago
- ☆43Updated last year
- ☆55Updated last year
- Official repo for Learning to Reason for Long-Form Story Generation☆72Updated 7 months ago
- Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation☆48Updated last year
- Discovering Data-driven Hypotheses in the Wild☆118Updated 5 months ago
- A virtual environment for developing and evaluating automated scientific discovery agents.☆191Updated 8 months ago
- Forecasting high-impact research topics via machine learning on evolving knowledge graphs☆44Updated 7 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆41Updated last year
- [EMNLP 2024 Findings] Benchmarking Language Model Agents for Data-Driven Science☆33Updated last year
- ☆62Updated 2 months ago