Flames is a highly adversarial benchmark in Chinese for LLM's harmlessness evaluation developed by Shanghai AI Lab and Fudan NLP Group.
☆63May 21, 2024Updated last year
Alternatives and similar repositories for Flames
Users that are interested in Flames are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆17Mar 22, 2024Updated 2 years ago
- ☆45Jun 19, 2025Updated 10 months ago
- [EMNLP 2024] ”ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models“☆26Jun 24, 2024Updated last year
- S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models☆114Feb 13, 2026Updated 2 months ago
- ☆30Aug 9, 2023Updated 2 years ago
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- ☆21Aug 19, 2024Updated last year
- 面向中文大模型价值观的评估与对齐研究☆557Jul 20, 2023Updated 2 years ago
- Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge☆14Feb 20, 2024Updated 2 years ago
- GAOGAO-Bench-Updates is a supplement to the GAOKAO-Bench, a dataset to evaluate large language models.☆41Jan 7, 2025Updated last year
- ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]☆227Sep 29, 2024Updated last year
- ☆17Oct 15, 2023Updated 2 years ago
- SC-Safety: 中文大模型多轮对抗安全基准☆150Mar 15, 2024Updated 2 years ago
- Official github repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety. [ACL 2024]☆283Jul 28, 2025Updated 9 months ago
- An active inference model of Lacanian psychoanalysis☆17Jun 7, 2025Updated 10 months ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Accepted by ECCV 2024☆203Oct 15, 2024Updated last year
- [EMNLP 2023 Demo] "CLEVA: Chinese Language Models EVAluation Platform"☆64May 16, 2025Updated 11 months ago
- ☆30Feb 16, 2024Updated 2 years ago
- ☆17Nov 3, 2024Updated last year
- ☆29Oct 14, 2021Updated 4 years ago
- ☆14Oct 7, 2022Updated 3 years ago
- This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector"☆48Oct 13, 2025Updated 6 months ago
- ☆43Jun 25, 2025Updated 10 months ago
- ☆14Aug 7, 2025Updated 8 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"☆139Jun 5, 2024Updated last year
- [ICLR 2025] Official implementation for "SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanati…☆44Feb 11, 2025Updated last year
- [ICML 2025] Official repository for paper "OR-Bench: An Over-Refusal Benchmark for Large Language Models"☆26Mar 4, 2025Updated last year
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆64Jul 8, 2024Updated last year
- A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)☆173Jun 27, 2025Updated 10 months ago
- Math24o: 高中奥林匹克数学竞赛测评集 High School Olympiad Mathematics Chinese Benchmark☆11Mar 27, 2025Updated last year
- [ISSTA'24] A Large-Scale Dataset Capable of Enhancing the Prowess of Large Language Models for Program Testing☆12Jan 7, 2025Updated last year
- BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).☆180Oct 27, 2023Updated 2 years ago
- Reimplementation of SALICON saliency model in Pytorch☆11Oct 3, 2023Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT☆36Oct 15, 2023Updated 2 years ago
- Improved techniques for optimization-based jailbreaking on large language models (ICLR2025)☆144Apr 7, 2025Updated last year
- ☆10Mar 19, 2024Updated 2 years ago
- FlagEval is an evaluation toolkit for AI large foundation models.☆337Apr 24, 2025Updated last year
- Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https//arxiv.org/abs/2410.03489)☆19Oct 22, 2024Updated last year
- ☆129Feb 3, 2025Updated last year
- Codes and data for CIKM 2022 paper "RuDi: Explaining Behavior Sequence Models by Automatic Statistics Generation and Rule Distillation"☆12Aug 16, 2022Updated 3 years ago