sahsaeedi / triple-preference-optimization
☆18 Updated 5 months ago
Related projects
Alternatives and complementary repositories for triple-preference-optimization
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆55 Updated 3 months ago
- [NeurIPS-2024] Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆71 Updated last month
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆59 Updated 5 months ago
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment ☆46 Updated 5 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆63 Updated last month
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆64 Updated 5 months ago
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards ☆44 Updated 6 months ago
- AnchorAttention: Improved attention for LLMs long-context training ☆142 Updated this week
- Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" ☆40 Updated last month
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆24 Updated 4 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆84 Updated 8 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆97 Updated 2 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆55 Updated 5 months ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆56 Updated last month
- Code and data for the benchmark "Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Lan… ☆34 Updated 4 months ago
- ☆54 Updated 2 months ago
- DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models ☆58 Updated 3 weeks ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆74 Updated 9 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆47 Updated last month
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆24 Updated 2 weeks ago
- ☆61 Updated 9 months ago
- ☆31 Updated 3 weeks ago
- Directional Preference Alignment ☆51 Updated 2 months ago
- ☆27 Updated last year
- [EMNLP 2023, Findings] GRACE: Discriminator-Guided Chain-of-Thought Reasoning ☆44 Updated last month
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆39 Updated last month
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆39 Updated 3 months ago
- ☆51 Updated 7 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆34 Updated last month
- ☆90 Updated 4 months ago