☆15Apr 27, 2024Updated last year
Alternatives and similar repositories for CASPER
Users that are interested in CASPER are comparing it to the libraries listed below.
- Prolog specification of TensorFlow layers☆14Jun 12, 2023Updated 2 years ago
- ☆27Feb 1, 2023Updated 3 years ago
- Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794…☆24Jul 26, 2024Updated last year
- ☆33Jun 24, 2024Updated last year
- A series of BERT and Albert model checkpoints trained to reduce gendered correlations in pre-training☆11Oct 22, 2020Updated 5 years ago
- ☆14Sep 7, 2022Updated 3 years ago
- ☆11Sep 4, 2017Updated 8 years ago
- ☆14Aug 7, 2025Updated 7 months ago
- Adversarial Attack for Pre-trained Code Models☆10Jul 19, 2022Updated 3 years ago
- ☆52May 24, 2023Updated 2 years ago
- ☆20May 31, 2024Updated last year
- Code for "Astraea: Grammar-based Fairness Testing"☆10Jan 7, 2022Updated 4 years ago
- Llama Chinese community: the best Chinese Llama large models, fully open source and available for commercial use☆12Aug 5, 2023Updated 2 years ago
- AndroidSlicer is a dynamic slicing tool, useful for a variety of tasks, from testing to debugging to security.☆14Jul 28, 2019Updated 6 years ago
- DSN jailbreak Attack & Evaluation Ensemble☆17Feb 7, 2026Updated last month
- Dataset and code for the CLEVR-XAI dataset.☆33Oct 3, 2023Updated 2 years ago
- Intersectional bias in hate speech and abusive language datasets☆15Jan 25, 2024Updated 2 years ago
- Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https://arxiv.org/abs/2410.03489)☆19Oct 22, 2024Updated last year
- [NeurIPS 2025] StegoZip: Enhancing Linguistic Steganography Payload in Practice with Large Language Models☆29Dec 4, 2025Updated 3 months ago
- 1.0☆13Jun 7, 2025Updated 9 months ago
- Can We Trust Large Language Models?: A Benchmark for Responsible Large Language Models via Toxicity, Bias, and Value-alignment Evaluation☆26Oct 12, 2023Updated 2 years ago
- ☆17May 18, 2021Updated 4 years ago
- This is the implementation for IEEE S&P 2022 paper "Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Secur…☆11Aug 24, 2022Updated 3 years ago
- [NDSS'25] The official implementation of safety misalignment.☆17Jan 8, 2025Updated last year
- ☆14Mar 9, 2025Updated last year
- [CVPR2025] Official Repository for IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment☆27Jun 11, 2025Updated 9 months ago
- ☆20Aug 26, 2018Updated 7 years ago
- Data set for LREC 2020 paper "I Feel Offended, Don't Be Abusive!"☆18Sep 23, 2023Updated 2 years ago
- Mandoline is an accurate, low-overhead dynamic slicer for Android applications.☆11Dec 24, 2025Updated 3 months ago
- PyTorch implementation for the pilot study on the robustness of latent diffusion models.☆12Jun 20, 2023Updated 2 years ago
- [EMNLP 2025] Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking☆12Aug 22, 2025Updated 7 months ago
- Official repository of paper "Let All be Whitened: Multi-teacher Distillation for Efficient Visual Retrieval"☆10Dec 20, 2023Updated 2 years ago
- CCS 2023 | Explainable malware and vulnerability detection with XAI in paper "FINER: Enhancing State-of-the-art Classifiers with Feature …☆11Aug 20, 2024Updated last year
- ☆11Sep 10, 2024Updated last year
- Audio processing using PyTorch 1D convolution networks (based on nnAudio). Gammatone Spectrogram and SpecAugmentation are now available…☆20Nov 30, 2020Updated 5 years ago
- Learning Certified Individually Fair Representations☆24Nov 7, 2020Updated 5 years ago
- Interpretable unified language safety checking with large language models☆32Apr 15, 2023Updated 2 years ago
- [ACL 2025] LongSafety: Evaluating Long-Context Safety of Large Language Models☆16Jun 18, 2025Updated 9 months ago
- ☆26Nov 7, 2022Updated 3 years ago