bertiev/SimpleSafetyTests

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/bertiev/SimpleSafetyTests)

bertiev / SimpleSafetyTests

☆19

Alternatives and similar repositories for SimpleSafetyTests

Users that are interested in SimpleSafetyTests are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

haidequanbu / ESC-Eval
View on GitHub
[EMNLP 2024] ”ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models“
☆27Jun 24, 2024Updated 2 years ago
MicroSTM / AGENT-synthesis
View on GitHub
Data synthesis code for "AGENT: A Benchmark for Core Psychological Reasoning"
☆24Mar 3, 2022Updated 4 years ago
leondz / autoredteam
View on GitHub
autoredteam: code for training models that automatically red team other language models
☆17Aug 9, 2023Updated 2 years ago
HITsz-TMG / GEMEL
View on GitHub
Official implementation of our LREC-COLING 2024 paper "Generative Multimodal Entity Linking".
☆36Feb 27, 2025Updated last year
patronus-ai / trail-benchmark
View on GitHub
☆21May 14, 2025Updated last year
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
choidami / inductive-oocr
View on GitHub
☆16Mar 22, 2025Updated last year
SqrtiZhang / openreview_ICRL2024_analysis
View on GitHub
☆10Nov 28, 2023Updated 2 years ago
yzhang1918 / cikm2022rudi
View on GitHub
Codes and data for CIKM 2022 paper "RuDi: Explaining Behavior Sequence Models by Automatic Statistics Generation and Rule Distillation"
☆12Aug 16, 2022Updated 3 years ago
kq-chen / qwen-vl-utils
View on GitHub
helper functions for processing and integrating visual language information with Qwen-VL Series Model
☆17Aug 30, 2024Updated last year
CUHK-Shenzhen-SE / RetromorphicTesting
View on GitHub
☆11Jan 19, 2025Updated last year
keven980716 / weak-to-strong-deception
View on GitHub
[ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"
☆15Jun 21, 2024Updated 2 years ago
RobustNLP / TestNER
View on GitHub
A toolkit for testing and improving named entity recognition [ESEC/FSE'23]
☆11Aug 31, 2023Updated 2 years ago
zsdonghao / zsdonghao.github.io
View on GitHub
Click this --> https://zsdonghao.github.io
☆10Updated this week
Jarviswang94 / Multilingual_safety_benchmark
View on GitHub
Multilingual safety benchmark for Large Language Models
☆53Sep 1, 2024Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
gabry1998 / Self-Supervised-Anomaly-Detection
View on GitHub
Thesis project about Visual Anomaly Detection based on Self Supervised Learning. The model identifies anomalies from information acquired…
☆10Apr 14, 2023Updated 3 years ago
Carol-gutianle / Awesome-llm-unlearning
View on GitHub
☆13Jun 17, 2024Updated 2 years ago
robalynch1122 / LLMSeasonalityTesting
View on GitHub
☆13Sep 12, 2024Updated last year
Alpaca4610 / nonebot_plugin_cogvideox
View on GitHub
免费的AI视频生成nonebot插件，支持文生视频和图文生视频
☆10May 7, 2025Updated last year
VITA-Group / Data-Efficient-Scaling
View on GitHub
[ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang
☆14Jan 4, 2024Updated 2 years ago
XiaojuanTang / ICSR
View on GitHub
implementation of paper "Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners"
☆20Aug 17, 2023Updated 2 years ago
Trustworthy-ML-Lab / Linear-Explanations
View on GitHub
[ICML 24] A novel automated neuron explanation framework that can accurately describe poly-semantic concepts in deep neural networks
☆14May 2, 2025Updated last year
KomeijiForce / Active_Passive_Constraint_Koishiday_2024
View on GitHub
Koishi's Day 2024 Paper (NeurIPS 2024): An advanced persona-driven role-playing system with global faithfulness quantification and optimi…
☆13Oct 19, 2025Updated 9 months ago
Optifine-TAT / CliMedBench
View on GitHub
☆12Nov 14, 2024Updated last year
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
krystalan / RAGtrans
View on GitHub
[EMNLP 2025 Findings] Retrieval-Augmented Machine Translation with Unstructured Knowledge
☆15Sep 4, 2025Updated 10 months ago
hackersground-kr / hackers-ground
View on GitHub
해커그라운드 해커톤 2024
☆12Aug 26, 2024Updated last year
kq-chen / VLMEvalKit
View on GitHub
Open-source evaluation toolkit of large vision-language models (LVLMs), support ~100 VLMs, 30+ benchmarks
☆15Feb 17, 2025Updated last year
OpenSafetyLab / SALAD-BENCH
View on GitHub
【ACL 2024】 SALAD benchmark & MD-Judge
☆176Mar 8, 2025Updated last year
eric-mitchell / concord
View on GitHub
☆14Nov 15, 2022Updated 3 years ago
abdulhaim / moral_foundations_llms
View on GitHub
☆16Oct 23, 2023Updated 2 years ago
LoryPack / LLM-LieDetector
View on GitHub
Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"
☆74Jun 19, 2024Updated 2 years ago
shadowkiller33 / Contrast-Instruction
View on GitHub
☆19Oct 2, 2023Updated 2 years ago
GateNLP / gateplugin-LearningFramework
View on GitHub
A plugin for the GATE language technology framework for training and using machine learning models. Currently supports Mallet (MaxEnt, N…
☆29Apr 17, 2023Updated 3 years ago
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
Sxxxw / BinaryLLMs-Eval
View on GitHub
[EMSE'26] An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding
☆15Aug 19, 2025Updated 11 months ago
dteodore / EmotionArcs
View on GitHub
☆11Mar 26, 2024Updated 2 years ago
hanningzhang / prm
View on GitHub
☆17Nov 3, 2024Updated last year
CyberAgentAILab / regularized-bon
View on GitHub
Code of "Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment" (2025).
☆14Apr 4, 2025Updated last year
TransluceAI / .github
View on GitHub
☆19Dec 12, 2025Updated 7 months ago
qcznlp / uncertainty_attack
View on GitHub
☆23Sep 2, 2025Updated 10 months ago
EventStudyTools / api-wrapper.r
View on GitHub
☆10Jul 7, 2026Updated 2 weeks ago