ethz-spylab / superhuman-ai-consistency
☆30 · Jun 19, 2023 · Updated 2 years ago
Alternatives and similar repositories for superhuman-ai-consistency
Users who are interested in superhuman-ai-consistency are comparing it to the repositories listed below
- ☆51 · Oct 23, 2023 · Updated 2 years ago
- Edit and Generate Anything in 3D world! ☆13 · Apr 15, 2023 · Updated 2 years ago
- Provable Worst Case Guarantees for the Detection of Out-of-Distribution Data ☆13 · Sep 20, 2022 · Updated 3 years ago
- ☆12 · Jul 8, 2023 · Updated 2 years ago
- Code used to run the platform for the LLM CTF colocated with SaTML 2024 ☆28 · Mar 20, 2024 · Updated last year
- Code and dataset for Polyglot Prompting: Multilingual Multitask Prompt Training. ☆18 · Dec 7, 2022 · Updated 3 years ago
- Implementation for the paper "Learning Invariant Representation for Continual Learning" in PyTorch. ☆12 · Jan 31, 2021 · Updated 5 years ago
- Code Repository for "A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models". ☆15 · Oct 14, 2022 · Updated 3 years ago
- Repository for reproducing `Model-Based Robust Deep Learning` ☆16 · Jan 22, 2021 · Updated 5 years ago
- A Data Source for Reasoning Embodied Agents ☆19 · Sep 18, 2023 · Updated 2 years ago
- Source code for "Neural Anisotropy Directions" ☆16 · Nov 17, 2020 · Updated 5 years ago
- Code for the paper "Evading Black-box Classifiers Without Breaking Eggs" [SaTML 2024] ☆21 · Apr 15, 2024 · Updated last year
- Implementation for <Robust Weight Perturbation for Adversarial Training> in IJCAI'22. ☆16 · Jul 1, 2022 · Updated 3 years ago
- Independent robustness evaluation of Improving Alignment and Robustness with Short Circuiting ☆18 · Apr 15, 2025 · Updated 10 months ago
- TACL 2025: Investigating Adversarial Trigger Transfer in Large Language Models ☆19 · Aug 17, 2025 · Updated 5 months ago
- ☆48 · Jun 19, 2024 · Updated last year
- A modern look at the relationship between sharpness and generalization [ICML 2023] ☆43 · Sep 11, 2023 · Updated 2 years ago
- Class Incremental learning, Task Incremental Learning ☆17 · Dec 19, 2022 · Updated 3 years ago
- ☆27 · Aug 27, 2025 · Updated 5 months ago
- [ECCV 2024] Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models ☆21 · Jul 17, 2024 · Updated last year
- Distilling Model Failures as Directions in Latent Space ☆47 · Feb 8, 2023 · Updated 3 years ago
- [CVPR 2024] This repository includes the official implementation our paper "Revisiting Adversarial Training at Scale" ☆20 · Apr 21, 2024 · Updated last year
- Official Repository for ICML 2023 paper "Can Neural Network Memorization Be Localized?" ☆21 · Oct 26, 2023 · Updated 2 years ago
- Source code of "Hold me tight! Influence of discriminative features on deep network boundaries" ☆21 · Dec 10, 2021 · Updated 4 years ago
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives". ☆29 · Oct 30, 2024 · Updated last year
- Code for the paper "Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression" ☆25 · Jun 28, 2023 · Updated 2 years ago
- Code release for the ICML 2019 paper "Are generative classifiers more robust to adversarial attacks?" ☆24 · May 10, 2019 · Updated 6 years ago
- Data for "Datamodels: Predicting Predictions with Training Data" ☆97 · May 25, 2023 · Updated 2 years ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆30 · Mar 5, 2024 · Updated last year
- Official repo for the paper "Make Some Noise: Reliable and Efficient Single-Step Adversarial Training" (https://arxiv.org/abs/2202.01181) ☆25 · Oct 17, 2022 · Updated 3 years ago
- Fluent student-teacher redteaming ☆23 · Jul 25, 2024 · Updated last year
- Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback" ☆66 · Apr 24, 2024 · Updated last year
- The repository contains code for Adaptive Data Optimization ☆32 · Dec 9, 2024 · Updated last year
- Code for T-MARS data filtering ☆35 · Aug 23, 2023 · Updated 2 years ago
- Influence Estimation for Gradient-Boosted Decision Trees ☆29 · May 27, 2024 · Updated last year
- Code for RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs. ACL 2023. ☆64 · Nov 27, 2024 · Updated last year
- Collection of in-progress libraries for entity neural networks. ☆30 · Jun 24, 2022 · Updated 3 years ago
- AFEC: Active Forgetting of Negative Transfer in Continual Learning (NeurIPS 2021) ☆28 · Sep 26, 2023 · Updated 2 years ago
- This is the repository for "Model Merging by Uncertainty-Based Gradient Matching", ICLR 2024. ☆29 · May 15, 2024 · Updated last year