☆18Mar 25, 2024Updated last year
Alternatives and similar repositories for SimpleSafetyTests
Users that are interested in SimpleSafetyTests are comparing it to the libraries listed below
- Code and data for the paper "Steering Conversational Large Language Models for Long Emotional Support Conversations" along with a UI to v…☆15Apr 14, 2025Updated 11 months ago
- Data synthesis code for "AGENT: A Benchmark for Core Psychological Reasoning"☆24Mar 3, 2022Updated 4 years ago
- [EMNLP 2024] "ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models"☆26Jun 24, 2024Updated last year
- Reduction Server in Rust☆13Apr 9, 2024Updated last year
- autoredteam: code for training models that automatically red team other language models☆15Aug 9, 2023Updated 2 years ago
- ICLR2024 Paper. Showing properties of safety tuning and exaggerated safety.☆93May 9, 2024Updated last year
- Official implementation of our LREC-COLING 2024 paper "Generative Multimodal Entity Linking".☆36Feb 27, 2025Updated last year
- Tools for robustness evaluation in interpretability methods☆11Jun 25, 2021Updated 4 years ago
- This repository contains the training and evaluation code for llm-jp-modernbert-base.☆15Jun 17, 2025Updated 9 months ago
- ☆10Nov 28, 2023Updated 2 years ago
- Your finetuned model's back to its original safety standards faster than you can say "SafetyLock"!☆11Oct 16, 2024Updated last year
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.☆25Oct 7, 2025Updated 5 months ago
- CVPR 2019 paper "Disentangling Adversarial Robustness and Generalization".☆14Oct 28, 2019Updated 6 years ago
- [WSDM 2026] LookAhead Tuning: Safer Language Models via Partial Answer Previews☆17Dec 14, 2025Updated 3 months ago
- Helper functions for processing and integrating visual-language information with Qwen-VL series models☆17Aug 30, 2024Updated last year
- ☆15Oct 23, 2023Updated 2 years ago
- ☆16May 16, 2025Updated 10 months ago
- ☆18Apr 7, 2025Updated 11 months ago
- ☆16Mar 22, 2025Updated 11 months ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"☆14Jun 21, 2024Updated last year
- ☆11Jan 19, 2025Updated last year
- Multilingual safety benchmark for Large Language Models☆53Sep 1, 2024Updated last year
- Tasks for describing differences between text distributions.☆17Aug 9, 2024Updated last year
- ☆21Jan 11, 2023Updated 3 years ago
- Free AI video generation nonebot plugin, supporting text-to-video and image-plus-text-to-video☆10May 7, 2025Updated 10 months ago
- ☆13Sep 12, 2024Updated last year
- Checkpointable dataset utilities for foundation model training☆32Jan 29, 2024Updated 2 years ago
- DP/TP/PP implemented with torch.distributed☆13Dec 28, 2023Updated 2 years ago
- ☆15Mar 22, 2024Updated 2 years ago
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang☆14Jan 4, 2024Updated 2 years ago
- ☆20Nov 15, 2024Updated last year
- [ICLR 2026] "When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms"☆33Feb 3, 2026Updated last month
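- This project uses deep learning to detect 3D human pose in real time and predicts future human motion from it. The backend uses the mmpose framework with multiprocessing for fast prediction, and a HoloLens 2 mixed-reality headset displays the motion, giving real-time capture, prediction, and display.☆12Oct 30, 2023Updated 2 years ago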
- Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 30+ benchmarks☆15Feb 17, 2025Updated last year
- ☆12Nov 15, 2022Updated 3 years ago
- ☆43Jul 10, 2024Updated last year
- 🐴🐘 Data on Members of the 116th U.S. Congress☆10Dec 11, 2019Updated 6 years ago
- Named Entity Recognition for Danish☆17Jul 23, 2019Updated 6 years ago
- [EMNLP 2025 Findings] Retrieval-Augmented Machine Translation with Unstructured Knowledge☆14Sep 4, 2025Updated 6 months ago