stanford-crfm / fmti
The Foundation Model Transparency Index
★ 77 · Updated 10 months ago
Alternatives and similar repositories for fmti:
Users who are interested in fmti are comparing it to the repositories listed below.
- ★ 221 · Updated this week
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ★ 68 · Updated last year
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ★ 93 · Updated last year
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ★ 65 · Updated 2 years ago
- codebase release for EMNLP2023 paper publication ★ 19 · Updated last year
- ★ 263 · Updated 2 months ago
- Your buddy in the (L)LM space. ★ 63 · Updated 6 months ago
- Client interface to Cleanlab Studio and the Trustworthy Language Model ★ 30 · Updated last month
- Resources related to EACL 2023 paper "SwitchPrompt: Learning Domain-Specific Gated Soft Prompts for Classification in Low-Resource Domain… ★ 52 · Updated last year
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ★ 108 · Updated last year
- Reference-Free automatic summarization evaluation with potential hallucination detection ★ 100 · Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ★ 102 · Updated 3 months ago
- Public repository containing METR's DVC pipeline for eval data analysis ★ 33 · Updated this week
- ★ 89 · Updated last month
- Functional Benchmarks and the Reasoning Gap ★ 84 · Updated 5 months ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ★ 17 · Updated last year
- ★ 68 · Updated last year
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ★ 80 · Updated last year
- ★ 47 · Updated last year
- Erasing concepts from neural representations with provable guarantees ★ 226 · Updated 2 months ago
- ★ 67 · Updated 7 months ago
- Evaluating LLMs with CommonGen-Lite ★ 89 · Updated last year
- RAGElo is a set of tools that helps you selecting the best RAG-based LLM agents by using an Elo ranker ★ 107 · Updated 2 weeks ago
- An introduction to LLM Sampling ★ 77 · Updated 3 months ago
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. ★ 109 · Updated this week
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ★ 46 · Updated last year
- The NDIF server, which performs deep inference and serves nnsight requests remotely ★ 21 · Updated last week
- ★ 76 · Updated 9 months ago
- Command Line Interface for Hugging Face Inference Endpoints ★ 66 · Updated 11 months ago
- ★ 48 · Updated last year