[ICLR 2025 Spotlight] Advantage-Guided Distillation for Preference Alignment in Small Language Models
☆26Feb 10, 2025Updated last year
Alternatives and similar repositories for ADPA
Users who are interested in ADPA are comparing it to the repositories listed below.
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion☆14Mar 17, 2025Updated last year
- ☆19Apr 18, 2025Updated last year
- This project implements two dynamic spatiotemporal interpolation (DST) methods, i.e., coarse-grained DST (CGDST) and fine-grained DST (FG…☆11Apr 15, 2022Updated 4 years ago
- Official repository for "Boosting Adversarial Transferability using Dynamic Cues" (ICLR 2023)☆20Aug 24, 2023Updated 2 years ago
- ☆16Jun 25, 2025Updated last year
- PAL: Proxy-Guided Black-Box Attack on Large Language Models☆56Aug 17, 2024Updated last year
- ☆24Jul 25, 2024Updated last year
- Simulation and robot code for contact-rich household object insertion (ICRA 2023).☆24Dec 18, 2024Updated last year
- [ICML 2023] "Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?" by Ruisi Cai, Zhenyu Zhang, Zhangyang Wang☆16May 4, 2023Updated 3 years ago
- [ICML 2024] Code release for "On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm"☆11Feb 20, 2025Updated last year
- Adapting LLaMA Decoder to Vision Transformer☆30May 20, 2024Updated 2 years ago
- Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity"☆33Oct 12, 2024Updated last year
- A python implementation of PROCLUS: PROjected CLUStering algorithm.☆10Jan 12, 2015Updated 11 years ago
- [NeurIPS’25] Code for the paper "Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization"☆32Sep 25, 2025Updated 9 months ago
- ☆12Sep 29, 2024Updated last year
- [ICLR 2022] "Sparsity Winning Twice: Better Robust Generalization from More Efficient Training" by Tianlong Chen*, Zhenyu Zhang*, Pengjun…☆40Mar 20, 2022Updated 4 years ago
- ☆47Apr 9, 2025Updated last year
- [NeurIPS 2024] Large Language Model Unlearning via Embedding-Corrupted Prompts☆40Sep 26, 2024Updated last year
- Code for generating a single image pretraining dataset☆13Aug 3, 2021Updated 4 years ago
- A2C is a special case of PPO!☆23May 20, 2022Updated 4 years ago
- Get label positions to make Faster R-CNN training samples.☆12Nov 15, 2017Updated 8 years ago
- ☆35Dec 16, 2022Updated 3 years ago
- Implementation of our ICLR 2021 paper: Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial Examples.☆11Mar 9, 2021Updated 5 years ago
- Submission Guide + Discussion Board for AI Singapore Global Challenge for Safe and Secure LLMs (Track 1A).☆16Jul 4, 2024Updated last year
- Code repo for the ICML 2021 paper "Making Paper Reviewing Robust to Bid Manipulation Attacks".☆10Sep 15, 2021Updated 4 years ago
- ☆41Feb 3, 2026Updated 4 months ago
- RENT (Reinforcement Learning via Entropy Minimization) is an unsupervised method for training reasoning LLMs.☆43Oct 31, 2025Updated 7 months ago
- Implementation of our NeurIPS 2019 paper: Subspace Attack: Exploiting Promising Subspaces for Query-Efficient Black-box Attacks☆10Dec 16, 2019Updated 6 years ago
- Official Implementation of CausalMoMa (RSS2023)☆28May 30, 2023Updated 3 years ago
- Experiments with reasoning models, training techniques, papers☆30Updated this week
- Cleaned test data list of DukeMTMC-reID, ICCV2021☆15Aug 26, 2021Updated 4 years ago
- HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models☆13Mar 6, 2025Updated last year
- This repo offers advanced tutorials for LLMs, BERT-based models, and multimodal models, covering fine-tuning, quantization, vocabulary ex…☆24May 5, 2025Updated last year
- GitHub repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL"☆35Nov 1, 2025Updated 7 months ago
- Implements the network model from "Multiway Attention Networks for Modeling Sentence Pairs"; can be used for question answering and sentence-level logical inference☆11Apr 13, 2020Updated 6 years ago
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024.☆118Jun 13, 2024Updated 2 years ago
- Code of paper "HyperVLA: Efficient Inference in Vision-Language-Action Models via Hypernetworks"☆24Oct 8, 2025Updated 8 months ago
- ☆16Jun 26, 2021Updated 5 years ago
- Debiasing Through Data Attribution☆13May 23, 2024Updated 2 years ago