☆19Mar 25, 2024Updated 2 years ago
Alternatives and similar repositories for SimpleSafetyTests
Users that are interested in SimpleSafetyTests are comparing it to the libraries listed below.
- Reduction Server in Rust☆14Apr 9, 2024Updated 2 years ago
- Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP☆11Jan 29, 2024Updated 2 years ago
- ICLR2024 Paper. Showing properties of safety tuning and exaggerated safety.☆94May 9, 2024Updated 2 years ago
- Official implementation of our LREC-COLING 2024 paper "Generative Multimodal Entity Linking".☆36Feb 27, 2025Updated last year
- Tools for robustness evaluation in interpretability methods☆10Jun 25, 2021Updated 4 years ago
- Your finetuned model's back to its original safety standards faster than you can say "SafetyLock"!☆11Oct 16, 2024Updated last year
- Codes and data for CIKM 2022 paper "RuDi: Explaining Behavior Sequence Models by Automatic Statistics Generation and Rule Distillation"☆12Aug 16, 2022Updated 3 years ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.☆25Oct 7, 2025Updated 8 months ago
- CVPR 2019 paper "Disentangling Adversarial Robustness and Generalization".☆14Oct 28, 2019Updated 6 years ago
- [WSDM 2026] LookAhead Tuning: Safer Language Models via Partial Answer Previews☆17Dec 14, 2025Updated 5 months ago
- helper functions for processing and integrating visual language information with Qwen-VL Series Model☆17Aug 30, 2024Updated last year
- ☆20Jul 24, 2024Updated last year
- ☆16Oct 23, 2023Updated 2 years ago
- ☆19Apr 7, 2025Updated last year
- ☆16May 16, 2025Updated last year
- ☆16Mar 22, 2025Updated last year
- ☆11Jan 19, 2025Updated last year
- Thesis project about Visual Anomaly Detection based on Self Supervised Learning. The model identifies anomalies from information acquired…☆10Apr 14, 2023Updated 3 years ago
- Tasks for describing differences between text distributions.☆17Aug 9, 2024Updated last year
- ☆22Jan 11, 2023Updated 3 years ago
- Free AI video generation nonebot plugin, supporting text-to-video and image-plus-text-to-video☆10May 7, 2025Updated last year
- ☆13Jun 17, 2024Updated last year
- Checkpointable dataset utilities for foundation model training☆32Jan 29, 2024Updated 2 years ago
- Notebooks for Scaling Deep Learning Interpretability by Visualizing Activation and Attribution Summarizations☆15Oct 3, 2019Updated 6 years ago
- Implementing DP/TP/PP using torch.distributed☆15Dec 28, 2023Updated 2 years ago
- ☆17Mar 22, 2024Updated 2 years ago
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang☆14Jan 4, 2024Updated 2 years ago
- ☆20Nov 15, 2024Updated last year
- Koishi's Day 2024 Paper (NeurIPS 2024): An advanced persona-driven role-playing system with global faithfulness quantification and optimi…☆12Oct 19, 2025Updated 7 months ago
- An experiment in trying to define a core and cleaned-up NumPy API: RNumPy☆13Feb 19, 2021Updated 5 years ago
- ☆12Nov 14, 2024Updated last year
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉☆12Jul 19, 2023Updated 2 years ago
- 【ACL 2024】 SALAD benchmark & MD-Judge☆176Mar 8, 2025Updated last year
- 🐴🐘 Data on Members of the 116th U.S. Congress☆10Dec 11, 2019Updated 6 years ago
- [EMNLP 2025 Findings] Retrieval-Augmented Machine Translation with Unstructured Knowledge☆15Sep 4, 2025Updated 9 months ago
- [NeurIPS 2024] CoSy is an automatic evaluation framework for textual explanations of neurons.☆20Jan 28, 2026Updated 4 months ago
- Psy-Insight: Mental Health Oriented Interpretable Multi-turn Bilingual Counseling Dataset for Large Language Model Finetuning☆29Jan 4, 2026Updated 5 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆38Feb 22, 2025Updated last year
- ☆19Oct 2, 2023Updated 2 years ago