xqlin98 / INSTINCT
This is the official implementation for the paper: Use Your INSTINCT: INSTruction optimization usIng Neural bandits Coupled with Transformers
☆34 Updated 3 months ago
Related projects:
- [ICML 2024] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark". ☆62 Updated 2 months ago
- Official Code for Paper: Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆55 Updated 2 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆79 Updated last year
- Landing Page for TOFU ☆79 Updated 3 months ago
- [NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training ☆24 Updated 9 months ago
- Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizX… ☆57 Updated 6 months ago
- This is the official code for the paper "Vaccine: Perturbation-aware Alignment for Large Language Models" ☆12 Updated last week
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆40 Updated 2 weeks ago
- Influence Analysis and Estimation - Survey, Papers, and Taxonomy ☆58 Updated 6 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆48 Updated 5 months ago
- Code for safety test in "Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates" ☆16 Updated 6 months ago
- This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition. ☆77 Updated 4 months ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆27 Updated 4 months ago
- `dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms. ☆27 Updated this week
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆78 Updated last week
- FusionBench: A Comprehensive Benchmark of Deep Model Fusion ☆42 Updated 2 weeks ago
- [ACL 2024] Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models ☆22 Updated 2 weeks ago
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆38 Updated last month
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆79 Updated 3 months ago
- Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep ☆22 Updated 2 months ago
- [ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models. ☆36 Updated last month