stanford-crfm / EUAIActJune15
Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act
☆93 · Updated last year
Alternatives and similar repositories for EUAIActJune15:
Users interested in EUAIActJune15 are comparing it to the repositories listed below.
- The Foundation Model Transparency Index (☆75, updated 8 months ago)
- Fiddler Auditor is a tool to evaluate language models (☆175, updated 11 months ago)
- An open-source tool to assess and improve the trustworthiness of AI systems (☆87, updated this week)
- An open-source compliance-centered evaluation framework for Generative AI models (☆131, updated 2 months ago)
- 📚 A curated list of papers & technical articles on AI Quality & Safety (☆169, updated last year)
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… (☆93, updated this week)
- Credo AI Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data assessment, and acts as a central … (☆46, updated 8 months ago)
- Creating the tools and data sets necessary to evaluate vulnerabilities in LLMs (☆23, updated last week)
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" (☆108, updated last year)
- WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting (☆37, updated 6 months ago)
- Check for data drift between two OpenAI multi-turn chat jsonl files (☆37, updated 10 months ago)
- 🤗 Disaggregators: Curated data labelers for in-depth analysis (☆65, updated 2 years ago)
- Doing simple retrieval from LLM models at various context lengths to measure accuracy (☆100, updated 10 months ago)
- Sample notebooks and prompts for LLM evaluation (☆120, updated 2 months ago)
- Building a chatbot powered by a RAG pipeline to read, summarize, and quote the most relevant papers related to the user query (☆166, updated 9 months ago)
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning (☆45, updated last year)
- Command Line Interface for Hugging Face Inference Endpoints (☆67, updated 10 months ago)
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents using an Elo ranker; a minimal sketch of the Elo update appears after this list (☆106, updated last week)
- A curated list of awesome academic research, books, code of ethics, data sets, institutes, maturity models, newsletters, principles, podc… (☆65, updated this week)
- Mixing Language Models with Self-Verification and Meta-Verification (☆101, updated 2 months ago)
- Run safety benchmarks against AI models and view detailed reports showing how well they performed (☆79, updated this week)
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training" (☆93, updated 11 months ago)
- A tool for evaluating LLMs (☆402, updated 9 months ago)
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings (☆106, updated this week)
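
For context on the RAGElo entry above: the Elo scheme it builds on rates competitors from pairwise comparisons. The sketch below is a minimal, generic Elo update, not RAGElo's actual API; the function names, the K-factor of 32, and the 1500/1600 starting ratings are illustrative assumptions.

```python
# Minimal Elo-update sketch (illustrative; not RAGElo's actual code or API).
# After a pairwise comparison, the winner gains rating and the loser gives it
# up, scaled by how surprising the outcome was under the current ratings.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a tie."""
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b

# Example: a 1500-rated agent beats a 1600-rated one, so it gains ~20 points.
print(update(1500.0, 1600.0, 1.0))  # (~1520.5, ~1579.5)
```

Applied to RAG agents, each "match" would be a judge's preference between two agents' answers to the same query; repeating the update over many judgments converges to a ranking.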