Flames is a highly adversarial Chinese-language benchmark for evaluating the harmlessness of LLMs, developed by Shanghai AI Lab and Fudan NLP Group.
☆63May 21, 2024Updated 2 years ago
Alternatives and similar repositories for Flames
Users that are interested in Flames are comparing it to the libraries listed below.
- ☆17Mar 22, 2024Updated 2 years ago
- S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models☆116Feb 13, 2026Updated 3 months ago
- ☆31Aug 9, 2023Updated 2 years ago
- ☆21Aug 19, 2024Updated last year
- Research on evaluating and aligning the values of Chinese large language models☆556Jul 20, 2023Updated 2 years ago
- Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge☆14Feb 20, 2024Updated 2 years ago
- GAOGAO-Bench-Updates is a supplement to the GAOKAO-Bench, a dataset to evaluate large language models.☆44Jan 7, 2025Updated last year
- ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]☆231Sep 29, 2024Updated last year
- SC-Safety: a multi-turn adversarial safety benchmark for Chinese large language models☆151Mar 15, 2024Updated 2 years ago
- Official github repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety. [ACL 2024]☆288Jul 28, 2025Updated 10 months ago
- Chinese safety prompts for evaluating and improving the safety of LLMs.☆1,171Feb 27, 2024Updated 2 years ago
- Accepted by ECCV 2024☆208Oct 15, 2024Updated last year
- [EMNLP 2023 Demo] "CLEVA: Chinese Language Models EVAluation Platform"☆64May 16, 2025Updated last year
- ☆30Feb 16, 2024Updated 2 years ago
- ☆17Nov 3, 2024Updated last year
- ☆31Oct 14, 2021Updated 4 years ago
- ☆14Oct 7, 2022Updated 3 years ago
- ☆15Aug 7, 2025Updated 10 months ago
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"☆139Jun 5, 2024Updated 2 years ago
- This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector"☆49Oct 13, 2025Updated 7 months ago
- [ICLR 2025] Official implementation for "SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanati…☆45Feb 11, 2025Updated last year
- [ICML 2025] Official repository for paper "OR-Bench: An Over-Refusal Benchmark for Large Language Models"☆26Mar 4, 2025Updated last year
- Adversarial Attack for Pre-trained Code Models☆10Jul 19, 2022Updated 3 years ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆66Jul 8, 2024Updated last year
- A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)☆176Jun 27, 2025Updated 11 months ago
- Fudan Baize large language model safety benchmark test set (Summer 2024 edition)☆51Jul 31, 2024Updated last year
- [ISSTA'24] A Large-Scale Dataset Capable of Enhancing the Prowess of Large Language Models for Program Testing☆12Jan 7, 2025Updated last year
- BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).☆180Oct 27, 2023Updated 2 years ago
- Reimplementation of SALICON saliency model in Pytorch☆12Oct 3, 2023Updated 2 years ago
- Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT☆37Oct 15, 2023Updated 2 years ago
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆75May 20, 2025Updated last year
- Improved techniques for optimization-based jailbreaking on large language models (ICLR2025)☆145Apr 7, 2025Updated last year
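- "Stones from other hills may serve to polish jade": a series on large language model evaluation and governance released by the Fudan JADE team☆512May 14, 2026Updated 3 weeks ago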
- Math24o: High School Olympiad Mathematics Chinese Benchmark☆12Mar 27, 2025Updated last year
- ☆10Mar 19, 2024Updated 2 years ago
- CMMLU: Measuring massive multitask language understanding in Chinese☆820Dec 6, 2024Updated last year
- LaTeX thesis template for CS undergraduates, Fudan University, 2022☆24Jul 28, 2024Updated last year
- FlagEval is an evaluation toolkit for AI large foundation models.☆337Apr 24, 2025Updated last year
- Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https://arxiv.org/abs/2410.03489)☆20Oct 22, 2024Updated last year