AlignmentResearch/scaling-poisoning

☆17

Alternatives and similar repositories for scaling-poisoning

Users who are interested in scaling-poisoning are comparing it to the repositories listed below.

JiiahaoXU / AlignIns
[CVPR 2025] The official PyTorch implementation of AlignIns
☆20 · Dec 20, 2025 · Updated 7 months ago
wicai24 / DOOR-Alignment
☆20 · Apr 7, 2025 · Updated last year
zhang-wei-chao / DC-PDD
This repository presents the original implementation of Pretraining Data Detection for Large Language Models: A Divergence-based Calibrat…
☆23 · May 21, 2025 · Updated last year
DyMessi / VisCRA
☆19 · Dec 23, 2025 · Updated 7 months ago
CGCL-codes / Gen-AF
The implementation of our IEEE S&P 2024 paper "Securely Fine-tuning Pre-trained Encoders Against Adversarial Examples".
☆11 · Jun 28, 2024 · Updated 2 years ago
LiuAmber / RAHF
[ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…
☆28 · Sep 25, 2024 · Updated last year
sohailahmedkhan / Phishing-Websites-Classification-using-Deep-Learning
A detailed comparison of performance scores achieved by Machine Learning and Deep Learning algorithms on 3 different Phishing datasets. 3…
☆16 · Sep 17, 2019 · Updated 6 years ago
umd-huang-lab / VLM-Poisoning
Code for NeurIPS 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
☆61 · Jan 15, 2025 · Updated last year
SolidShen / BAIT
🔥🔥🔥 Detecting hidden backdoors in Large Language Models with only black-box access
☆57 · Jun 2, 2025 · Updated last year
trucndt / ami
Codebase for Active Membership Inference Attack under Local Differential Privacy in Federated Learning
☆16 · Feb 9, 2024 · Updated 2 years ago
jokersio-tsy / CroSel
[CVPR 24] This is the official implementation of our paper "CroSel: Cross Selection of Confident Pseudo Labels for Partial-Label Learning".
☆15 · Apr 27, 2025 · Updated last year
DependableSystemsLab / MIA_defense_HAMP
Code for the paper "Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction" …
☆13 · Sep 6, 2023 · Updated 2 years ago
shuaizhao95 / ICLAttack
ICL backdoor attack
☆17 · Nov 4, 2024 · Updated last year
NeuralSentinel / SafeInfer
☆23 · Jan 14, 2025 · Updated last year
Htring / KGQAMedicine
A question-answering system built on a moderately sized, disease-centered knowledge graph for the medical domain
☆17 · Jul 24, 2022 · Updated 4 years ago
sreevardhanreddi / django-materialized-views
Creating and querying materialized views from Django.
☆11 · Aug 13, 2021 · Updated 4 years ago
Awenbocc / LLM-OOD
☆14 · Jul 24, 2024 · Updated 2 years ago
SRI-CSL / Trinity-TrojAI
This repository contains code developed by the SRI team for the IARPA/TrojAI program.
☆21 · Jul 1, 2021 · Updated 5 years ago
xirui-li / DrAttack
Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
☆68 · Aug 25, 2024 · Updated last year
PositionalHidden / PositionalHidden
To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and …
☆12 · Jun 18, 2024 · Updated 2 years ago
PurduePAML / K-ARM_Backdoor_Optimization
☆18 · Jun 15, 2021 · Updated 5 years ago
lancasterJie / FLAYER
☆22 · Jan 26, 2026 · Updated 5 months ago
z4yx / ucore-thumips
uCore MIPS32 porting
☆18 · Dec 16, 2019 · Updated 6 years ago
reml-lab / URSABench
Codebase for our paper "URSABench: Comprehensive Benchmarking of Approximate Bayesian Inference Methods for Deep Neural Networks"
☆19 · Aug 27, 2022 · Updated 3 years ago
robertjkeck2 / EmoNet
Audio-only Emotion Detection using Federated Learning
☆10 · Dec 8, 2022 · Updated 3 years ago
Gwinhen / DRUPE
Distribution Preserving Backdoor Attack in Self-supervised Learning
☆20 · Jan 27, 2024 · Updated 2 years ago
warestack / platform
A collection of reusable workflows and composite actions to help developers kickstart their pipelines.
☆13 · Oct 11, 2024 · Updated last year
teobaluta / etio
Causal Reasoning for Membership Inference Attacks
☆11 · Oct 21, 2022 · Updated 3 years ago
JiaweiLian / SRA
NeurIPS 2025
☆19 · Oct 20, 2025 · Updated 9 months ago
LRudL / sad
Situational Awareness Dataset
☆52 · Dec 14, 2024 · Updated last year
ShannonAI / backdoor_nlg
☆18 · Jul 1, 2021 · Updated 5 years ago
ijl20 / cambridge_coffee_pot
Real-time sensor for the Cambridge Coffee Pot (Computer Lab)
☆11 · Mar 24, 2020 · Updated 6 years ago
antibloch / mia_attacks
Shadow Attack, LiRA, Quantile Regression and RMIA implementations in PyTorch (Online version)
☆14 · Nov 8, 2024 · Updated last year
rosieyzh / openrlhf-pretrain
Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"
☆29 · Oct 14, 2025 · Updated 9 months ago
zhanghang1989 / AutoGluon-Docker
AutoGluon Docker
☆12 · Apr 17, 2020 · Updated 6 years ago
danielway / nexrad-volumetric-renderer
Project exploring 3D volumetric rendering of NEXRAD radar data.
☆13 · Oct 23, 2023 · Updated 2 years ago
6lyc / FedCEO_Collaborate-with-Each-Other
[ICML 2025] The Official implementation of our paper "Clients Collaborate: Flexible Differentially Private Federated Learning with Guaran…
☆23 · Dec 21, 2025 · Updated 7 months ago
Social-AI-Studio / MemeCraft
Official repository for WWW'24 paper "MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation"
☆12 · Jul 25, 2024 · Updated last year
scottshufe / Property-Inference-Attacks-Literature
☆13 · Sep 26, 2024 · Updated last year