ibm-granite / granite-guardian
The Granite Guardian models are designed to detect risks in prompts and responses.
☆72 · Updated last week
Alternatives and similar repositories for granite-guardian:
Users who are interested in granite-guardian are comparing it to the libraries listed below.
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆104 · Updated 6 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆75 · Updated 6 months ago
- Functional Benchmarks and the Reasoning Gap ☆84 · Updated 5 months ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆45 · Updated last month
- Simple examples using Argilla tools to build AI ☆53 · Updated 4 months ago
- ☆111 · Updated last month
- ☆50 · Updated 4 months ago
- Evaluating LLMs with CommonGen-Lite ☆89 · Updated last year
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆80 · Updated last month
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆63 · Updated last year
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆165 · Updated 3 weeks ago
- MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024] ☆136 · Updated 2 months ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆101 · Updated 6 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆107 · Updated last month
- ☆106 · Updated last week
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆103 · Updated last month
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆52 · Updated last week
- ☆255 · Updated 3 months ago
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans. ☆70 · Updated last month
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆52 · Updated 3 months ago
- Complex Function Calling Benchmark. ☆85 · Updated 2 months ago
- ☆74 · Updated last year
- The code for the paper ROUTERBENCH: A Benchmark for Multi-LLM Routing System ☆109 · Updated 9 months ago
- Open Implementations of LLM Analyses ☆103 · Updated 5 months ago
- ☆160 · Updated 7 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆22 · Updated 2 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 · Updated 2 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆104 · Updated 3 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆55 · Updated 6 months ago
- Train your own SOTA deductive reasoning model ☆81 · Updated 3 weeks ago