AIM-Intelligence/RepBend

AIM-Intelligence / RepBend

Code for Representation Bending Paper

☆16

Alternatives and similar repositories for RepBend

Users who are interested in RepBend are comparing it to the repositories listed below.

ethz-spylab / unlearning-vs-safety
☆27 · Oct 6, 2024 · Updated last year
HAE-RAE / HAERAE-VISION
Evaluation code for HAERAE-Vision benchmark
☆15 · Apr 29, 2026 · Updated 2 months ago
ys-zong / VLGuard
[ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models.
☆90 · Jan 19, 2025 · Updated last year
locuslab / acr-memorization
☆41 · Dec 19, 2024 · Updated last year
AIM-Intelligence / COMPASS
COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs
☆17 · Apr 7, 2026 · Updated 3 months ago
chrisliu298 / llm-unlearn-eco
[NeurIPS 2024] Large Language Model Unlearning via Embedding-Corrupted Prompts
☆41 · Sep 26, 2024 · Updated last year
HITsz-TMG / ICL-State-Vector
☆12 · Jul 4, 2024 · Updated 2 years ago
ariahw / rl-rewardhacking
☆44 · Feb 18, 2026 · Updated 5 months ago
dmis-lab / ETHIC
[NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
☆16 · Sep 2, 2025 · Updated 10 months ago
skai-research / ScholarEval
Official code and data for the paper "ScholarEval: Research Idea Evaluation Grounded in Literature."
☆20 · Oct 28, 2025 · Updated 8 months ago
demegire / Parameterization-of-Hypercomplex-Multiplications
This is a reproduction of the paper 'Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications wit…
☆13 · Aug 22, 2021 · Updated 4 years ago
RUCBM / ICLEval
☆14 · Jun 24, 2024 · Updated 2 years ago
nishadsinghi / sc-genrm-scaling
[COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…
☆15 · Oct 31, 2025 · Updated 8 months ago
ethz-spylab / misleading-privacy-evals
Official code for "Evaluations of Machine Learning Privacy Defenses are Misleading" (https://arxiv.org/abs/2404.17399)
☆13 · Apr 29, 2024 · Updated 2 years ago
dmis-lab / ChroKnowledge
[ICLR 2025] ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
☆17 · Mar 4, 2025 · Updated last year
poloclub / llm-landscape
NeurIPS'24 - LLM Safety Landscape
☆40 · Oct 21, 2025 · Updated 9 months ago
awwang10 / sphinx
☆14 · Oct 23, 2025 · Updated 8 months ago
LINs-lab / ELICIT
[ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability
☆14 · Mar 11, 2025 · Updated last year
peer-preservation / main
Code for the paper "Peer-Preservation in Frontier Models"
☆36 · Jul 2, 2026 · Updated 2 weeks ago
anadim / the-little-retrieval-test
☆35 · Jun 21, 2023 · Updated 3 years ago
chenzhiliang94 / convo-plan-SCOPE
SCOPE ICLR 2025
☆23 · Oct 3, 2025 · Updated 9 months ago
AIM-Intelligence / Automated-Multi-Turn-Jailbreaks
☆139 · Dec 3, 2025 · Updated 7 months ago
night-chen / DyGen
[KDD'23] This is the code repo for our KDD'23 paper "DyGen: Learning from Noisy Labels via Dynamics-Enhanced Generative Modeling".
☆11 · Jun 14, 2023 · Updated 3 years ago
LiuAmber / RAHF
[ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…
☆28 · Sep 25, 2024 · Updated last year
princeton-polaris-lab / Evaluating-Durable-Safeguards
[ICLR 2025] On Evaluating the Durability of Safeguards for Open-Weight LLMs
☆13 · Jun 20, 2025 · Updated last year
rtaori / data_feedback
Code for the paper "Data Feedback Loops: Model-driven Amplification of Dataset Biases"
☆18 · Sep 9, 2022 · Updated 3 years ago
ai-agi / LLMs-Enhanced-Long-Text-Generation-Survey
Long Form NLG Generation Based on Large Language Models
☆17 · Jan 31, 2024 · Updated 2 years ago
Re-Align / AlignTDS
Analyzing LLM Alignment via Token distribution shift
☆17 · Jan 26, 2024 · Updated 2 years ago
UCSC-REAL / FLAT
[ICLR 2025] FLAT: LLM Unlearning via Loss Adjustment with Only Forget Data
☆14 · Feb 26, 2025 · Updated last year
ContinualAI / clvision-challenge-2024
5th CLVISION workshop at CVPR: repo for the challenge
☆19 · May 13, 2024 · Updated 2 years ago
j6mes / acl2021-factual-error-correction
ACL 2021
☆26 · May 24, 2022 · Updated 4 years ago
john-hewitt / model-editing-canonical-examples
☆14 · Feb 12, 2024 · Updated 2 years ago
UCSB-NLP-Chang / causal_unlearn
[EMNLP 2024] "Revisiting Who's Harry Potter: Towards Targeted Unlearning from a Causal Intervention Perspective"
☆35 · Jul 22, 2024 · Updated last year
phycholosogy / RAG-privacy
The code for paper "The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)", exploring the privacy risk o…
☆67 · Feb 1, 2025 · Updated last year
Heidelberg-NLP / CC-SHAP-VLM
Official code implementation for the paper "Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Expl…
☆12 · Jul 14, 2026 · Updated last week
ReyonRen / MFGAN
☆21 · May 21, 2021 · Updated 5 years ago
alexzhou907 / dialogue_evaluation
☆22 · Dec 8, 2022 · Updated 3 years ago
M0rtzz / zzu-resume-template
Zhengzhou University (ZZU) résumé LaTeX template
☆15 · Jan 11, 2026 · Updated 6 months ago
likenneth / dialogue_action_token
Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
☆31 · Jun 27, 2024 · Updated 2 years ago