cisco-open / polygraphLLM
A library for detecting hallucination and improving LLM factuality
☆17Updated 3 months ago
Alternatives and similar repositories for polygraphLLM
Users who are interested in polygraphLLM are comparing it to the libraries listed below.
- ☆49Updated last year
- RAI is a Python library written to help AI developers with various aspects of responsible AI development.☆63Updated last year
- The Granite Guardian models are designed to detect risks in prompts and responses.☆122Updated 2 months ago
- Run safety benchmarks against AI models and view detailed reports showing how well they performed.☆112Updated this week
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.☆173Updated 2 weeks ago
- A method for steering LLMs to better follow instructions☆66Updated 4 months ago
- A toolkit for optimizing machine learning models for practical applications☆31Updated 9 months ago
- Collection of evals for Inspect AI☆305Updated this week
- Fiddler Auditor is a tool to evaluate language models.☆188Updated last year
- LangFair is a Python library for conducting use-case level LLM bias and fairness assessments☆242Updated last week
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use…☆160Updated 2 weeks ago
- Moonshot - A simple and modular tool to evaluate and red-team any LLM application.☆293Updated this week
- ☆144Updated 3 months ago
- WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting.☆54Updated last year
- This is the repository for Blaze, a composable NLP inference pipeline tool☆31Updated last year
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research☆270Updated this week
- ☆43Updated last year
- ☆226Updated last month
- Red-Teaming Language Models with DSPy☆244Updated 10 months ago
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs☆97Updated last year
- Improving Alignment and Robustness with Circuit Breakers☆248Updated last year
- ☆49Updated last year
- Language Model for Mainframe Modernization☆62Updated last year
- ☆200Updated 2 weeks ago
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …☆212Updated last week
- 🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.☆53Updated last week
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆189Updated 9 months ago
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t…☆503Updated 10 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training".☆122Updated last year
- Sample notebooks and prompts for LLM evaluation☆156Updated last month