dpaleka/stealing-part-lm-supplementary

Some code for "Stealing Part of a Production Language Model"

☆23

Alternatives and similar repositories for stealing-part-lm-supplementary

Users who are interested in stealing-part-lm-supplementary are comparing it to the repositories listed below.

ShuchiWu / RDA
☆11 · Dec 23, 2024 · Updated last year
jpwahle / emnlp23-paraphrase-types
The official implementation of the EMNLP 2023 paper "Paraphrase Types for Generation and Detection"
☆12 · Oct 20, 2024 · Updated last year
VAIV-2023 / RLHF-Korean-Friendly-LLM
Developing a Korean LLM model: Hate Speech Filtering, Improving conversational skills, Finetuning with the RLHF method
☆19 · May 27, 2025 · Updated last year
fra31 / rlhf-trojan-competition-submission
☆19 · Feb 25, 2024 · Updated 2 years ago
matthewwicker / SafeCV
Vision based algorithms for falsification of convolutional neural networks
☆12 · Jan 25, 2018 · Updated 8 years ago
ckkissane / sae-transfer
Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"
☆13 · Jul 18, 2024 · Updated 2 years ago
nomadcoders / instaclone-backend
Instaclone Backend built with Prisma and GraphQL.
☆17 · Mar 18, 2021 · Updated 5 years ago
Sandy-Zeng / NPAttack
PyTorch implementation of NPAttack
☆12 · Jul 7, 2020 · Updated 6 years ago
CharlieMat / PivotCVAE
This is the implementation code for the WWW2021 paper "Variation Control and Evaluation for Generative Slate Recommendation"
☆15 · Jun 7, 2021 · Updated 5 years ago
sisinflab / amlrecsys-tutorial
Tutorial by Vito Walter Anelli, Yashar Deldjoo, Tommaso Di Noia and Felice Antonio Merra about Adversarial Machine Learning in Recommende…
☆25 · Apr 12, 2021 · Updated 5 years ago
ydc123 / MMP-Attack
Official repository for "On the Multi-modal Vulnerability of Diffusion Models"
☆17 · Jul 15, 2024 · Updated 2 years ago
zlh-thu / StealingVerification
Defending against Model Stealing via Verifying Embedded External Features
☆38 · Feb 19, 2022 · Updated 4 years ago
Jielin-Qiu / MMWatermark-Robustness
Evaluating Durability: Benchmark Insights into Multimodal Watermarking
☆12 · Jun 7, 2024 · Updated 2 years ago
RunpeiDong / DGMS
[ICML 2022 Spotlight] Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks
☆11 · May 21, 2023 · Updated 3 years ago
YisenWang / ONL
Code for CVPR2018 "Iterative Learning with Open-set Noisy Labels"
☆12 · Mar 12, 2021 · Updated 5 years ago
ethz-spylab / rlhf_trojan_competition
Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024.
☆119 · Jun 13, 2024 · Updated 2 years ago
LTS4 / neural-anisotropy-directions
Source code for "Neural Anisotropy Directions"
☆16 · Nov 17, 2020 · Updated 5 years ago
Manu21JC / DataElixir
[AAAI 2024] DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models
☆12 · Dec 5, 2024 · Updated last year
labring / deck
sealos deck
☆14 · Mar 30, 2024 · Updated 2 years ago
ys-zong / FoolyourVLLMs
[ICML 2024] Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations
☆15 · Oct 28, 2023 · Updated 2 years ago
junhe / wiser
A fast text search engine built for SSDs, written in C++.
☆11 · Aug 29, 2022 · Updated 3 years ago
JJ-Vice / BAGM
All code and data necessary to replicate experiments in the paper BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Model…
☆13 · Sep 16, 2024 · Updated last year
meta-metrics / metametrics
Accepted to ICLR 2025. MetaMetrics is a calibrated meta-metric designed to evaluate generation tasks across different modalities aligned …
☆15 · Dec 30, 2024 · Updated last year
Programming-With-Love / bluebell
A gin + Vue project with a separated front end and back end. If it helps you learn gin, feel free to tip the author a cup of coffee.
☆13 · Nov 6, 2020 · Updated 5 years ago
unbiarirang / Fixed-Input-Parameterization
This repository contains the official code for the paper: "Prompt Injection: Parameterization of Fixed Inputs"
☆32 · Sep 13, 2024 · Updated last year
EternityYW / BiasEval-LLM-MentalHealth
Unveiling and Mitigating Bias in Mental Health Analysis with Large Language Models
☆12 · Jun 21, 2024 · Updated 2 years ago
guyuntian / CoT_benchmark
Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective"
☆21 · Jul 16, 2023 · Updated 3 years ago
dgl-prc / m_testing_adversatial_sample
☆26 · May 27, 2020 · Updated 6 years ago
kztakemoto / simbaja
All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks
☆17 · Apr 24, 2024 · Updated 2 years ago
Multiverse4FM / Multiverse
☆88 · Jun 16, 2025 · Updated last year
ArjunPanickssery / self_recognition
☆10 · May 17, 2024 · Updated 2 years ago
AISG-Technology-Team / GCSS-Track-1A-Submission-Guide
Submission Guide + Discussion Board for AI Singapore Global Challenge for Safe and Secure LLMs (Track 1A).
☆16 · Jul 4, 2024 · Updated 2 years ago
DABANGG-Attack / Source-Code
Source code for DABANGG attack.
☆10 · Mar 26, 2022 · Updated 4 years ago
MadryLab / rla
Residue Level Alignment
☆22 · Nov 21, 2024 · Updated last year
CryptoAILab / MergeGuard
[CCS-LAMPS'24] LLM IP Protection Against Model Merging
☆16 · Oct 14, 2024 · Updated last year
KYRIE-LI11 / VideoMark
☆23 · Aug 23, 2025 · Updated 10 months ago
PKU-ML / AdvNotRealFeatures
Official Code for reproductivity of the NeurIPS 2023 paper: Adversarial Examples Are Not Real Features
☆16 · Jun 27, 2024 · Updated 2 years ago
thu-coai / DiaSafety
This repo is for the paper: On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark
☆25 · Aug 13, 2022 · Updated 3 years ago
PKU-ML / DYNACL
[ICLR 2023] Official repository of the paper "Rethinking the Effect of Data Augmentation in Adversarial Contrastive Learning"
☆19 · Feb 19, 2023 · Updated 3 years ago