ibm-granite / granite-guardianLinks

The Granite Guardian models are designed to detect risks in prompts and responses.

☆93

Alternatives and similar repositories for granite-guardian

Users that are interested in granite-guardian are comparing it to the libraries listed below

Sorting:

ibm-granite / granite-3.0-language-models
☆261Updated last month
microsoft / eureka-ml-insights
A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.
☆165Updated last week
aymeric-roucher / GAIA
Beating the GAIA benchmark with Transformers Agents. 🚀
☆131Updated 5 months ago
Columbia-NLP-Lab / PAPILLON
Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles
☆53Updated 3 months ago
haizelabs / dspy-redteam
Red-Teaming Language Models with DSPy
☆203Updated 5 months ago
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆91Updated 6 months ago
Liyan06 / MiniCheck
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024]
☆174Updated 7 months ago
princeton-pli / hal-harness
☆102Updated this week
HishamAlyahya / semantic_backprop
Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖
☆72Updated 8 months ago
LLMSELECTOR / LLMSELECTOR
☆73Updated 5 months ago
invariantlabs-ai / explorer
A better way of testing, inspecting, and analyzing AI Agent traces.
☆39Updated last month
SalesforceAIResearch / CRMArena
Official Repo for CRMArena and CRMArena-Pro
☆104Updated last month
VectorInstitute / fed-rag
A framework for fine-tuning retrieval-augmented generation (RAG) systems.
☆123Updated 3 weeks ago
goncalorafaria / qalign
QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods.
☆23Updated 4 months ago
weaviate / structured-rag
Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models
☆111Updated 3 months ago
DeepSoftwareAnalytics / Awesome-Agent4SE
☆96Updated 10 months ago
rungalileo / hallucination-index
Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.
☆113Updated last week
patronus-ai / Lynx-hallucination-detection
☆41Updated last year
argilla-io / argilla-cookbook
Simple examples using Argilla tools to build AI
☆53Updated 8 months ago
MinorJerry / OpenWebVoyager
☆78Updated 9 months ago
IBM / unitxt
🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …
☆206Updated this week
apple / ToolSandbox
☆194Updated 11 months ago
aymeric-roucher / agent_reasoning_benchmark
🔧 Compare how Agent systems perform on several benchmarks. 📊🚀
☆99Updated this week
ServiceNow / Fast-LLM
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
☆218Updated this week
METR / eval-analysis-public
Public repository containing METR's DVC pipeline for eval data analysis
☆86Updated 4 months ago
AgnostiqHQ / multi-agent-llm
Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT)
☆118Updated 5 months ago
facebookresearch / collaborative-reasoner
Source code for the collaborative reasoner research project at Meta FAIR.
☆99Updated 3 months ago
facebookresearch / matrix
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…
☆81Updated last week
microsoft / lost_in_conversation
Code that accompanies the public release of the paper Lost in Conversation (https://arxiv.org/abs/2505.06120)
☆148Updated last month
sony / talkhier
Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems"
☆56Updated 5 months ago