Flames is a highly adversarial Chinese-language benchmark for evaluating the harmlessness of LLMs, developed by Shanghai AI Lab and Fudan NLP Group.
☆63May 21, 2024Updated last year
Alternatives and similar repositories for Flames
Users who are interested in Flames are comparing it to the repositories listed below.
- ☆16Mar 22, 2024Updated 2 years ago
- ☆45Jun 19, 2025Updated 9 months ago
- S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models☆111Feb 13, 2026Updated last month
- ☆30Aug 9, 2023Updated 2 years ago
- ☆21Aug 19, 2024Updated last year
- Research on evaluating and aligning the values of Chinese large language models☆555Jul 20, 2023Updated 2 years ago
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types☆24Nov 29, 2024Updated last year
- ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]☆227Sep 29, 2024Updated last year
- ☆17Oct 15, 2023Updated 2 years ago
- Official github repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety. [ACL 2024]☆281Jul 28, 2025Updated 8 months ago
- Chinese safety prompts for evaluating and improving the safety of LLMs.☆1,146Feb 27, 2024Updated 2 years ago
- An active inference model of Lacanian psychoanalysis☆16Jun 7, 2025Updated 10 months ago
- [EMNLP 2023 Demo] "CLEVA: Chinese Language Models EVAluation Platform"☆64May 16, 2025Updated 10 months ago
- ☆30Feb 16, 2024Updated 2 years ago
- ☆28Oct 14, 2021Updated 4 years ago
- ☆17Nov 3, 2024Updated last year
- ☆40Jun 25, 2025Updated 9 months ago
- ☆14Oct 7, 2022Updated 3 years ago
- This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector"☆47Oct 13, 2025Updated 5 months ago
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"☆138Jun 5, 2024Updated last year
- ☆14Aug 7, 2025Updated 8 months ago
- [ICLR 2025] Official implementation for "SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanati…☆45Feb 11, 2025Updated last year
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆64Jul 8, 2024Updated last year
- A system that turns jailbreak papers into runnable attacks and benchmarks — live, as research evolves.☆26Updated this week
- A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)☆173Jun 27, 2025Updated 9 months ago
- [ISSTA'24] A Large-Scale Dataset Capable of Enhancing the Prowess of Large Language Models for Program Testing☆12Jan 7, 2025Updated last year
- BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).☆178Oct 27, 2023Updated 2 years ago
- Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT☆37Oct 15, 2023Updated 2 years ago
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆75May 20, 2025Updated 10 months ago
- Improved techniques for optimization-based jailbreaking on large language models (ICLR2025)☆143Apr 7, 2025Updated last year
- ☆10Mar 19, 2024Updated 2 years ago
- CMMLU: Measuring massive multitask language understanding in Chinese☆807Dec 6, 2024Updated last year
- FlagEval is an evaluation toolkit for AI large foundation models.☆337Apr 24, 2025Updated 11 months ago
- [ACL 2024] SALAD benchmark & MD-Judge☆172Mar 8, 2025Updated last year
- ☆127Feb 3, 2025Updated last year
- Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https://arxiv.org/abs/2410.03489)☆19Oct 22, 2024Updated last year
- Official github repo for E-Eval, a Chinese K12 education evaluation benchmark for LLMs.☆29Feb 19, 2024Updated 2 years ago
- A tool library for riichi mahjong written in Rust, made mostly to be used as a WASM component.☆13Aug 29, 2025Updated 7 months ago
- ReasoningShield: Safety Detection over Reasoning Traces of Large Reasoning Models☆26Sep 27, 2025Updated 6 months ago