🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory"
★56 · Dec 20, 2023 · Updated 2 years ago
Alternatives and similar repositories for confaide
Users that are interested in confaide are comparing it to the libraries listed below.
- Official code for ICML 2024 paper "Learning to Continually Learn with the Bayesian Principle" ★21 · May 27, 2024 · Updated last year
- Code for our EMNLP 2020 paper: "Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness" ★37 · Oct 12, 2020 · Updated 5 years ago
- Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions" ★61 · May 31, 2024 · Updated last year
- Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888 ★37 · Jun 10, 2024 · Updated last year
- Official code for ACL 2023 (short, findings) paper "Recursion of Thought: A Divide and Conquer Approach to Multi-Context Reasoning with L… ★45 · Jun 13, 2023 · Updated 2 years ago
- ★28 · Nov 28, 2023 · Updated 2 years ago
- [Preprint] On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping ★10 · Feb 27, 2020 · Updated 6 years ago
- [ICLR'24 Spotlight] DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer ★47 · May 30, 2024 · Updated last year
- Code for "CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples" (NDSS 2020) ★22 · Nov 14, 2020 · Updated 5 years ago
- Official Code for ACL 2023 paper: "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confid… ★23 · May 8, 2023 · Updated 3 years ago
- The repository contains the code for analysing the leakage of personally identifiable (PII) information from the output of next word pred… ★104 · Aug 13, 2024 · Updated last year
- Documentation at ★14 · Mar 27, 2025 · Updated last year
- Private Adaptive Optimization with Side Information (ICML '22) ★16 · Jun 23, 2022 · Updated 3 years ago
- https://icml.cc/virtual/2023/poster/24354 ★10 · Aug 15, 2023 · Updated 2 years ago
- The git repository of Modular Prompted Chatbot paper ★35 · May 24, 2023 · Updated 2 years ago
- Hide and Seek (HaS): A Framework for Prompt Privacy Protection ★54 · Sep 6, 2023 · Updated 2 years ago
- Code for Findings of ACL 2021 "Differential Privacy for Text Analytics via Natural Text Sanitization" ★33 · Mar 15, 2022 · Updated 4 years ago
- NLPCC-2025 Shared-Task 1: LLM-Generated Text Detection ★16 · Apr 6, 2026 · Updated last month
- ★19 · Mar 6, 2023 · Updated 3 years ago
- Machine learning project using federated learning for text generation ★11 · May 5, 2024 · Updated 2 years ago
- [ACL 2021] Learning to Perturb Word Embeddings for Out-of-distribution QA ★16 · May 11, 2022 · Updated 4 years ago
- Bayesian Active Learning with Fully Bayesian Gaussian Processes ★14 · Sep 29, 2022 · Updated 3 years ago
- Code and benchmark for our Findings of ACL 2024 paper - "TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing… ★21 · Dec 20, 2024 · Updated last year
- ★21 · Sep 21, 2021 · Updated 4 years ago
- Code for paper: "RemovalNet: DNN model fingerprinting removal attack", IEEE TDSC 2023. ★10 · Nov 27, 2023 · Updated 2 years ago
- CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation ★14 · Aug 19, 2025 · Updated 9 months ago
- Implementation of Self-supervised-Online-Adversarial-Purification ★13 · Aug 2, 2021 · Updated 4 years ago
- [SatML 2024] Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk ★15 · Mar 15, 2025 · Updated last year
- Code for our EMNLP 2021 paper: "Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes" ★76 · Mar 22, 2022 · Updated 4 years ago
- Code for ICLR 2025 Paper "GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment" ★23 · Feb 10, 2025 · Updated last year
- ★13 · Oct 21, 2021 · Updated 4 years ago
- Flow Integrity Deterministic Enforcement System. Mechanisms for securing AI agents with information-flow control. ★92 · May 30, 2025 · Updated 11 months ago
- ★29 · Aug 31, 2025 · Updated 8 months ago
- The official pytorch implementation of ACM MM 19 paper "MetaAdvDet: Towards Robust Detection of Evolving Adversarial Attacks" ★11 · Jun 7, 2021 · Updated 4 years ago
- Official repo to reproduce the paper "How to Backdoor Diffusion Models?" published at CVPR 2023 ★95 · Sep 17, 2025 · Updated 8 months ago
- [NeurIPS'22] Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork. Haotao Wang, Junyuan Hong,… ★14 · Nov 27, 2023 · Updated 2 years ago
- GhostSuite (Official Codebase for "Data Shapley in One Training Run", ICLR'25) ★35 · Jan 16, 2026 · Updated 4 months ago
- Prediction Poisoning: Towards Defenses Against DNN Model Stealing Attacks (ICLR '20) ★33 · Nov 4, 2020 · Updated 5 years ago
- [EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP ★13 · Aug 17, 2023 · Updated 2 years ago