ibm-granite / granite-guardian
The Granite Guardian models are designed to detect risks in prompts and responses.
☆88 Updated 3 months ago
Alternatives and similar repositories for granite-guardian
Users who are interested in granite-guardian are comparing it to the libraries listed below.
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆38 Updated 3 weeks ago
- ☆259 Updated 6 months ago
- The code for the paper ROUTERBENCH: A Benchmark for Multi-LLM Routing System ☆124 Updated last year
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆123 Updated 4 months ago
- ☆69 Updated 4 months ago
- ☆77 Updated 7 months ago
- A framework for fine-tuning retrieval-augmented generation (RAG) systems. ☆112 Updated this week
- Source code for the collaborative reasoner research project at Meta FAIR. ☆91 Updated 2 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆35 Updated last month
- Collection of evals for Inspect AI ☆155 Updated this week
- ☆39 Updated 11 months ago
- ☆92 Updated 3 weeks ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆173 Updated 3 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆71 Updated this week
- Official Repo for CRMArena and CRMArena-Pro ☆92 Updated last week
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆68 Updated 3 months ago
- Official Code Repository for the paper "Distilling LLM Agent into Small Models with Retrieval and Code Tools" ☆109 Updated 3 weeks ago
- Simple examples using Argilla tools to build AI ☆53 Updated 7 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆115 Updated 4 months ago
- ☆61 Updated 3 weeks ago
- ⚖️ Awesome LLM Judges ⚖️ ☆105 Updated last month
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆112 Updated 9 months ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆55 Updated 4 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆78 Updated 9 months ago
- ☆96 Updated 9 months ago
- ☆115 Updated 4 months ago
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. ☆156 Updated this week
- Scaling Data for SWE-agents ☆256 Updated this week
- MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024] ☆166 Updated 5 months ago
- Evaluating LLMs with CommonGen-Lite ☆90 Updated last year