[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
☆50Oct 23, 2024Updated last year
Alternatives and similar repositories for beta-DPO
Users that are interested in beta-DPO are comparing it to the libraries listed below
Sorting:
- [ICML 2025] Official code of "AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization"☆30Jan 10, 2026Updated last month
- ☆15Feb 26, 2025Updated last year
- official implementation of "CLIP-VQDiffusion : Langauge Free Training of Text To Image generation using CLIP and vector quantized diffusi…☆18Sep 5, 2024Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"☆17Mar 31, 2025Updated 11 months ago
- ☆10Jul 8, 2021Updated 4 years ago
- Modality Gap–Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models☆51Feb 23, 2026Updated last week
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents☆23Feb 21, 2026Updated last week
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆17Nov 4, 2025Updated 4 months ago
- The official implementation of Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion [AAAI'2…☆15Feb 2, 2026Updated last month
- ☆19Updated this week
- [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197)☆39Sep 8, 2025Updated 5 months ago
- ☆46Jun 11, 2025Updated 8 months ago
- ☆31Oct 2, 2024Updated last year
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter".☆12Dec 27, 2023Updated 2 years ago
- Aligning Agentic World Models via Knowledgeable Experience Learning☆31Jan 25, 2026Updated last month
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"☆49Jan 30, 2026Updated last month
- KDD 2024 AQA competition 2nd place solution☆12Jul 21, 2024Updated last year
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi…☆14Jun 6, 2025Updated 8 months ago
- Improving upon state of the art cooperative deep reinforcement learning in StarCraft II☆13May 16, 2019Updated 6 years ago
- A research project exploring fine-tuning BERT-style models for text generation☆36Nov 30, 2025Updated 3 months ago
- Fun project to run your own LLM chat bot using llama.cpp☆11Jun 9, 2023Updated 2 years ago
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems"☆13Dec 13, 2024Updated last year
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs☆29Dec 24, 2025Updated 2 months ago
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers"☆40Mar 31, 2025Updated 11 months ago
- [CVPR 2024] PriViLege: Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners☆57Sep 5, 2024Updated last year
- ☆16May 13, 2025Updated 9 months ago
- [ICML 2025] Official code of "DAMA: Data- and Model-aware Alignment of Multi-modal LLMs"☆16May 24, 2025Updated 9 months ago
- CoV: Chain-of-View Prompting for Spatial Reasoning☆51Jan 23, 2026Updated last month
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion☆14Mar 17, 2025Updated 11 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories☆40Aug 7, 2025Updated 6 months ago
- Preference Learning for LLaVA☆59Nov 9, 2024Updated last year
- [NeurIPS 2024] The implementation of paper "On Softmax Direct Preference Optimization for Recommendation"☆96Nov 29, 2024Updated last year
- [NeurIPS 2025] Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning☆126Dec 13, 2025Updated 2 months ago
- Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images☆18Jun 4, 2025Updated 9 months ago
- CS194-196 Course Project☆14Feb 20, 2025Updated last year
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation☆16Oct 27, 2024Updated last year
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders☆18May 23, 2025Updated 9 months ago
- CIKM Cup 2016 (1st Place) - Track 1 - Cross Device Entity Linking☆18Sep 19, 2017Updated 8 years ago
- FinMTEB: Finance Massive Text Embedding Benchmark (EMNLP 2025 Main)☆52Nov 15, 2025Updated 3 months ago