BCG-X-Official / artkit
Automated prompt-based testing and evaluation of Gen AI applications
☆150 Updated 4 months ago
Alternatives and similar repositories for artkit
Users who are interested in artkit are comparing it to the libraries listed below.
- Fiddler Auditor is a tool to evaluate language models. ☆184 Updated last year
- An open-source compliance-centered evaluation framework for Generative AI models ☆158 Updated last week
- Practical examples of "Flawed Machine Learning Security" together with ML Security best practice across the end to end stages of the mach… ☆112 Updated 3 years ago
- A tool for evaluating LLMs ☆423 Updated last year
- AI Verify ☆23 Updated this week
- A catalog of design patterns when building generative AI applications ☆164 Updated last week
- ☆53 Updated last year
- Synthetic Data SDK ✨ ☆608 Updated this week
- Sample notebooks and prompts for LLM evaluation ☆135 Updated last month
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆94 Updated last year
- Build MLOps Pipelines in Minutes ☆246 Updated last week
- LangFair is a Python library for conducting use-case level LLM bias and fairness assessments ☆219 Updated this week
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t… ☆454 Updated 5 months ago
- Deliver safe & effective language models ☆529 Updated this week
- A Lightweight Library for AI Observability ☆246 Updated 4 months ago
- Automated knowledge graph creation SDK ☆122 Updated 7 months ago
- Moonshot - A simple and modular tool to evaluate and red-team any LLM application. ☆258 Updated this week
- wandbot is a technical support bot for Weights & Biases' AI developer tools that can run in Discord, Slack, ChatGPT and Zendesk ☆302 Updated last week
- A small library of LLM judges ☆232 Updated 3 weeks ago
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆123 Updated last week
- Metafeature Extraction for Unstructured Data ☆102 Updated 4 months ago
- Product analytics for AI Assistants ☆154 Updated last month
- GenAIOps on Kubernetes: A collection of reference architectures for running GenAI at scale on Kubernetes using OSS tooling ☆130 Updated 8 months ago
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ☆111 Updated 10 months ago
- An API for Chat Fine-Tuned Large Language Models (llm) ☆87 Updated 11 months ago
- Examples for TruEra users to get started! ☆26 Updated last year
- Simple, Pythonic building blocks to evaluate LLM applications. ☆230 Updated last week
- This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and busine… ☆151 Updated 9 months ago
- The fastest Trust Layer for AI Agents ☆138 Updated last month
- A curated list of awesome synthetic data tools (open source and commercial). ☆192 Updated last year