BCG-X-Official / artkit
Automated prompt-based testing and evaluation of Gen AI applications
☆153 Updated 7 months ago
Alternatives and similar repositories for artkit
Users who are interested in artkit are comparing it to the libraries listed below.
- Fiddler Auditor is a tool to evaluate language models. ☆188 Updated last year
- An open-source compliance-centered evaluation framework for Generative AI models ☆168 Updated this week
- Practical examples of "Flawed Machine Learning Security" together with ML Security best practice across the end to end stages of the mach… ☆119 Updated 3 years ago
- A tool for evaluating LLMs ☆424 Updated last year
- AI Verify ☆35 Updated last week
- A Lightweight Library for AI Observability ☆251 Updated 7 months ago
- A small library of LLM judges ☆294 Updated 2 months ago
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆93 Updated 2 years ago
- 🔍 LangKit: An open-source toolkit for monitoring Large Language Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring sa… ☆952 Updated 10 months ago
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t… ☆487 Updated 8 months ago
- Metafeature Extraction for Unstructured Data ☆103 Updated 7 months ago
- A catalog of design patterns when building generative AI applications ☆197 Updated last month
- ☆163 Updated 8 months ago
- Simple, Pythonic building blocks to evaluate LLM applications. ☆241 Updated last week
- LangFair is a Python library for conducting use-case level LLM bias and fairness assessments ☆236 Updated last week
- A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs). ☆149 Updated last year
- Sample notebooks and prompts for LLM evaluation ☆151 Updated this week
- Framework for LLM evaluation, guardrails and security ☆113 Updated last year
- ☆54 Updated last year
- ☆73 Updated 11 months ago
- Build MLOps Pipelines in Minutes ☆248 Updated 2 months ago
- Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning) ☆397 Updated last year
- Synthetic Data SDK ✨ ☆657 Updated last week
- wandbot is a technical support bot for Weights & Biases' AI developer tools that can run in Discord, Slack, ChatGPT and Zendesk ☆310 Updated last month
- Red-Teaming Language Models with DSPy ☆219 Updated 8 months ago
- ☆170 Updated last year
- SUQL: Conversational Search over Structured and Unstructured Data with LLMs ☆288 Updated 2 months ago
- Plug-and-play, zero-shot document processing pipelines. ☆107 Updated this week
- This is an open-source tool to assess and improve the trustworthiness of AI systems. ☆99 Updated last month
- Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications. ☆318 Updated 3 months ago