stanford-crfm / fmti
The Foundation Model Transparency Index
☆71 · Updated 5 months ago
Related projects
Alternatives and complementary repositories for fmti
- ☆199 · Updated this week
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆65 · Updated last year
- ☆101 · Updated 3 months ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆63 · Updated this week
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆61 · Updated 9 months ago
- Your buddy in the (L)LM space. ☆63 · Updated 2 months ago
- Code for training & evaluating Contextual Document Embedding models ☆117 · Updated this week
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆109 · Updated last year
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆49 · Updated 3 weeks ago
- Web UI & Backend for Data Annotations in Aya ☆26 · Updated 8 months ago
- ☆66 · Updated 2 weeks ago
- ☆258 · Updated this week
- ☆93 · Updated last month
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆40 · Updated 8 months ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆98 · Updated 10 months ago
- ☆66 · Updated this week
- Codebase accompanying the Summary of a Haystack paper. ☆72 · Updated 2 months ago
- Functional Benchmarks and the Reasoning Gap ☆78 · Updated last month
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ☆41 · Updated 11 months ago
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆92 · Updated last year
- ☆91 · Updated last year
- Open Implementations of LLM Analyses ☆94 · Updated last month
- Evaluating LLMs with fewer examples ☆134 · Updated 7 months ago
- ☆48 · Updated 2 weeks ago
- 📚 A curated list of papers & technical articles on AI Quality & Safety ☆161 · Updated last year
- A repository containing the code for translating popular LLM benchmarks to German. ☆24 · Updated last year
- ☆129 · Updated 3 weeks ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆68 · Updated last year
- ☆68 · Updated 3 months ago
- A toolkit for describing model features and intervening on those features to steer behavior. ☆99 · Updated last week