Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
☆44Jul 28, 2024Updated last year
Alternatives and similar repositories for BiPO
Users who are interested in BiPO are comparing it to the libraries listed below.
- Materials for "Multi-property Steering of Large Language Models with Dynamic Activation Composition"☆14Nov 22, 2024Updated last year
- Official code for ICML 2024 paper on Persona In-Context Learning (PICLe)☆27Jun 27, 2024Updated last year
- A resource repository for representation engineering in large language models☆150Nov 14, 2024Updated last year
- [ICLR 2025] General-purpose activation steering library☆160Sep 18, 2025Updated 6 months ago
- Steering Llama 2 with Contrastive Activation Addition☆222May 23, 2024Updated last year
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"☆20Dec 14, 2024Updated last year
- This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Lang…☆22Jul 3, 2024Updated last year
- ☆18Sep 1, 2025Updated 7 months ago
- Camouflage poisoning via machine unlearning☆19Jul 3, 2025Updated 9 months ago
- ☆15Mar 6, 2026Updated last month
- Code release for the paper "Style Vectors for Steering Generative Large Language Models", accepted to the Findings of the EACL 2024.☆36Sep 26, 2024Updated last year
- Github Repo for ICML 2022 paper: Communication-Efficient Adaptive Federated Learning☆10Nov 18, 2022Updated 3 years ago
- Repository for the paper: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning☆18Feb 21, 2025Updated last year
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering☆198Feb 13, 2025Updated last year
- Code for the paper "Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM"☆14Nov 17, 2023Updated 2 years ago
- Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks☆17Jan 15, 2025Updated last year
- ☆22Sep 5, 2025Updated 7 months ago
- Official code for the paper Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Le…☆75Mar 20, 2024Updated 2 years ago
- Transformer Doctor: Diagnosing and Treating Vision Transformers☆11Jan 15, 2025Updated last year
- [ICML 2024] Language Models Represent Beliefs of Self and Others☆35Sep 26, 2024Updated last year
- Algebraic value editing in pretrained language models☆70Nov 1, 2023Updated 2 years ago
- [COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free☆56Apr 6, 2025Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)☆66Oct 18, 2024Updated last year
- ☆36Aug 28, 2025Updated 7 months ago
- Code release for MPCViT accepted by ICCV 2023☆16Jan 6, 2025Updated last year
- Steering vectors for transformer language models in Pytorch / Huggingface☆148Feb 21, 2025Updated last year
- Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".☆370Jun 13, 2025Updated 10 months ago
- [EMNLP 2024] Quantize LLM to extremely low-bit, and finetune the quantized LLMs☆15Jul 18, 2024Updated last year
- This is the official repository for paper: cross-modal information flow in multimodal large language models☆43May 21, 2025Updated 10 months ago
- ☆20Feb 2, 2026Updated 2 months ago
- CIKM 2021 Full Paper: FedMatch: Federated Learning Over Heterogeneous Question Answering Data☆12Dec 14, 2021Updated 4 years ago
- ☆16Jul 20, 2023Updated 2 years ago
- Evaluating methods to improve model transfer for intensive care unit models☆16Jul 6, 2023Updated 2 years ago
- ☆10Apr 16, 2024Updated 2 years ago
- ☆39Dec 19, 2024Updated last year
- A TinyStories LM with SAEs and transcoders☆14Apr 3, 2025Updated last year
- One-Pixel Shortcut: on the Learning Preference of Deep Neural Networks (ICLR 2023 Spotlight)☆14Sep 28, 2025Updated 6 months ago
- Shadow Attack, LiRA, Quantile Regression and RMIA implementations in PyTorch (Online version)☆14Nov 8, 2024Updated last year
- Code for Reducing Hallucinations in Vision-Language Models via Latent Space Steering☆107Nov 23, 2024Updated last year