cisco-open / modelsmith
A toolkit for optimizing machine learning models for practical applications
☆26 · Updated this week
Alternatives and similar repositories for modelsmith:
Users who are interested in modelsmith are comparing it to the repositories listed below:
- Repo for the research paper "Aligning LLMs to Be Robust Against Prompt Injection" ☆32 · Updated last month
- [SafeGenAi @ NeurIPS 2024] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates ☆67 · Updated 2 months ago
- Code and data to go with the Zhu et al. paper "An Objective for Nuanced LLM Jailbreaks" ☆21 · Updated last month
- Official repository for "Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks" ☆48 · Updated 5 months ago
- Private Evolution: Generating DP Synthetic Data without Training [ICLR 2024, ICML 2024] ☆84 · Updated this week
- Code for the paper "Evading Black-box Classifiers Without Breaking Eggs" [SaTML 2024] ☆19 · Updated 9 months ago
- ☆31 · Updated last year
- Jailbreak artifacts for JailbreakBench ☆46 · Updated 2 months ago
- The official repository of the paper "On the Exploitability of Instruction Tuning". ☆58 · Updated 11 months ago
- ☆39 · Updated last year
- Fluent student-teacher redteaming ☆19 · Updated 5 months ago
- [ICLR 2022] Boosting Randomized Smoothing with Variance Reduced Classifiers ☆12 · Updated 2 years ago
- ☆15 · Updated 10 months ago
- ☆21 · Updated 4 months ago
- LLM Self Defense: By Self Examination, LLMs know they are being tricked ☆31 · Updated 8 months ago
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives ☆66 · Updated 10 months ago
- ☆39 · Updated last year
- Code related to "Beyond spectral gap: The role of the topology in decentralized learning". ☆13 · Updated 2 years ago
- Code for safety test in "Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates" ☆17 · Updated 10 months ago
- Implementation of paper 'Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference' [NeurIPS'24… ☆16 · Updated 7 months ago
- Independent robustness evaluation of Improving Alignment and Robustness with Short Circuiting ☆13 · Updated 5 months ago
- ☆15 · Updated last month
- A modern look at the relationship between sharpness and generalization [ICML 2023] ☆43 · Updated last year
- The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models". ☆39 · Updated 2 months ago
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆74 · Updated this week
- UnifiedUncertaintyCalibration ☆11 · Updated last year
- ☆89 · Updated last year
- Implementation of PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆31 · Updated 2 months ago
- Privacy backdoors ☆51 · Updated 8 months ago
- ☆27 · Updated last year