whitecircle-ai/circle-guard-bench

First-of-its-kind AI benchmark for evaluating the protection capabilities of large language model (LLM) guard systems (guardrails and safeguards)

☆52

Alternatives and similar repositories for circle-guard-bench

Users who are interested in circle-guard-bench are comparing it to the libraries listed below.

VikhrModels / mctslib
☆31 · Sep 23, 2024 · Updated last year
VikhrModels / DOoM
Benchmark for evaluating the ability of language models to solve math and physics problems in Russian
☆22 · Nov 14, 2025 · Updated 5 months ago
VikhrModels / ru_llm_arena
Modified Arena-Hard-Auto LLM evaluation toolkit with an emphasis on Russian language
☆47 · Mar 20, 2025 · Updated last year
humane-intelligence / ai_village_defcon_grt_data
☆15 · Jun 7, 2024 · Updated last year
weizeming / momentum-attack-llm
☆25 · Jan 17, 2025 · Updated last year
Sxela / YADA
Yet Another Diffusion Automation
☆13 · Aug 21, 2022 · Updated 3 years ago
RapidResponseBench / rapidresponsebench
☆34 · Nov 12, 2024 · Updated last year
SibNN / asr_eval
Evaluation tools for Automatic Speech Recognition (ASR), model and dataset collection
☆31 · Mar 9, 2026 · Updated last month
alzobnin / hse-cs-prog
☆31 · Feb 6, 2020 · Updated 6 years ago
tommccoy1 / embers-of-autoregression
☆29 · Dec 30, 2024 · Updated last year
IlyaGusev / holosophos
Tools and agents for automated research.
☆53 · Dec 5, 2025 · Updated 4 months ago
VikhrModels / effective_llm_alignment
Effective LLM Alignment Toolkit
☆153 · Jun 25, 2025 · Updated 9 months ago
NaturalCycles / MixBABA
A tool for making AB tests with Mixpanel API
☆12 · Jan 25, 2019 · Updated 7 years ago
safety-research / finetuning-auditor
Auditing agents for fine-tuning safety
☆20 · Oct 21, 2025 · Updated 5 months ago
birshert / HSE-optimization-exam
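Study notes for exam preparation for the 2020 Continuous Optimization course, for the Machine Learning and Applications specialization of the Applied Mathematics and Informatics program at the HSE Faculty of Computer Science.
☆11 · Jun 26, 2020 · Updated 5 years ago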
martysai / artificial-text-detection
Python framework for artificial text detection: NLP approaches for comparing natural text against text generated by neural networks.
☆16 · Sep 5, 2023 · Updated 2 years ago
chai-research / lmgym
Code base for internal reward models and PPO training
☆24 · Oct 1, 2023 · Updated 2 years ago
thu-coai / Agent-SafetyBench
☆126 · Aug 11, 2025 · Updated 8 months ago
robomotic / awesome-guide-ai-safety
☆12 · Jun 7, 2025 · Updated 10 months ago
kwebio / kweb-demos
Simple projects that demonstrate kweb's capabilities 🦆
☆13 · Aug 2, 2021 · Updated 4 years ago
birshert / HSE-differential-equations
HSE AMI course of differential equations
☆10 · Jun 17, 2019 · Updated 6 years ago
haizelabs / bijection-learning
☆28 · Oct 22, 2024 · Updated last year
WangHanLinHenry / STeCa
(ACL2025 Findings) Official code for the paper "STeCa: Step-level Trajectory Calibration for LLM Agent Learning"
☆26 · Mar 2, 2026 · Updated last month
Bots-Avatar / ExplainitAll
ExplainitAll is a library for interpretable AI, designed for interpreting generative models (GPT-like) and vectorizer…
☆19 · Oct 11, 2024 · Updated last year
alexmavr / promptsage
Promptsage is an LLM prompt builder, linter and sanitizer with built-in guardrails
☆23 · Mar 25, 2024 · Updated 2 years ago
princeton-polaris-lab / Evaluating-Durable-Safeguards
[ICLR 2025] On Evaluating the Durability of Safeguards for Open-Weight LLMs
☆13 · Jun 20, 2025 · Updated 9 months ago
TransluceAI / jailbreaking-frontier-models
☆25 · Sep 3, 2025 · Updated 7 months ago
aniket-work / AI_Powered_Dev_Search_Engine
AI_Powered_Dev_Search_Engine
☆12 · Mar 10, 2024 · Updated 2 years ago
DS3Lab / CocktailSGD
☆27 · Aug 25, 2023 · Updated 2 years ago
Aloriosa / srmt
The original Shared Recurrent Memory Transformer implementation
☆33 · Jul 11, 2025 · Updated 9 months ago
rubenpt91 / PFL-DocVQA-Competition
☆21 · Oct 23, 2024 · Updated last year
lutzroeder / agents
Minimal coding, computer-use and deep research agents using the OpenAI Agents SDK
☆34 · Mar 9, 2026 · Updated last month
KEAML-JLU / DADGNN
The source code of "Deep attention diffusion graph neural networks for text classification"
☆13 · Nov 11, 2023 · Updated 2 years ago
invariantlabs-ai / invariant
Guardrails for secure and robust agent development
☆411 · Jan 12, 2026 · Updated 3 months ago
xiamengzhou / training_trajectory_analysis
[ACL 2023] Training Trajectories of Language Models Across Scales https://arxiv.org/pdf/2212.09803.pdf
☆25 · Nov 14, 2023 · Updated 2 years ago
shehper / AC-Solver
A long-horizon, sparse-reward math environment for reinforcement learning. Official code repo for "What makes Math problems hard for rein…
☆32 · Aug 11, 2025 · Updated 8 months ago
microsoft / llmail-inject-challenge
Code for the API, workload execution, and agents underlying the LLMail-Inject Adaptive Prompt Injection Challenge
☆23 · Apr 9, 2026 · Updated last week
jotaf98 / shareddataset
A PyTorch Dataset that caches samples in shared memory, accessible globally to all processes
☆24 · May 11, 2022 · Updated 3 years ago
IlyaGusev / codearkt
Implementation of the CodeAct agentic framework with Docker containers for security, MCP servers for tool integrations, and multi-agent s…
☆40 · Oct 22, 2025 · Updated 5 months ago