Giskard-AI / awesome-ai-safety
📚 A curated list of papers & technical articles on AI Quality & Safety
★184 · Updated 2 months ago
Alternatives and similar repositories for awesome-ai-safety
Users interested in awesome-ai-safety are comparing it to the libraries listed below:
- Fiddler Auditor is a tool to evaluate language models. ★183 · Updated last year
- Creating the tools and data sets necessary to evaluate vulnerabilities in LLMs. ★24 · Updated 3 months ago
- Datasets and models for instruction-tuning. ★238 · Updated last year
- Sample notebooks and prompts for LLM evaluation. ★134 · Updated 2 weeks ago
- ★267 · Updated 5 months ago
- The Foundation Model Transparency Index. ★81 · Updated last year
- Red-Teaming Language Models with DSPy. ★198 · Updated 4 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents using an Elo ranker. ★112 · Updated 2 weeks ago
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act. ★94 · Updated last year
- A tool for evaluating LLMs. ★419 · Updated last year
- AI Data Management & Evaluation Platform. ★215 · Updated last year
- Collection of evals for Inspect AI. ★167 · Updated this week
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. ★94 · Updated this week
- Erasing concepts from neural representations with provable guarantees. ★228 · Updated 5 months ago
- Large Language Model (LLM) Inference API and Chatbot. ★126 · Updated last year
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a… ★381 · Updated last year
- Keeping language models honest by directly eliciting knowledge encoded in their activations. ★207 · Updated 2 weeks ago
- A curated list of awesome academic research, books, code of ethics, data sets, institutes, maturity models, newsletters, principles, podc… ★75 · Updated this week
- Open Implementations of LLM Analyses. ★104 · Updated 8 months ago
- 📸 Open-Source Evaluation & Testing for Computer Vision AI systems. ★29 · Updated 8 months ago
- Explore and interpret large embeddings in your browser with interactive visualization! ★465 · Updated last year
- An open-source compliance-centered evaluation framework for Generative AI models. ★153 · Updated this week
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]. ★316 · Updated 5 months ago
- Data cleaning and curation for unstructured text. ★327 · Updated 10 months ago
- [ICML 2024] Binoculars: Zero-Shot Detection of LLM-Generated Text. ★284 · Updated last year
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ★239 · Updated 4 months ago
- Credo AI Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data assessment, and acts as a central … ★47 · Updated last year
- Persuasive Jailbreaker: we can persuade LLMs to jailbreak them! ★307 · Updated 8 months ago
- Ghostbuster: Detecting Text Ghostwritten by Large Language Models (NAACL 2024). ★161 · Updated last year
- Improving Alignment and Robustness with Circuit Breakers. ★214 · Updated 9 months ago