☆16Jul 23, 2024Updated last year
Alternatives and similar repositories for refdpo
Users that are interested in refdpo are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning"☆11Jan 10, 2025Updated last year
- Visual and Embodied Concepts evaluation benchmark☆21Oct 10, 2023Updated 2 years ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆39Jan 12, 2024Updated 2 years ago
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study☆59Nov 24, 2024Updated last year
- This is the oficial repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- ☆37Sep 16, 2024Updated last year
- Codebase for Instruction Following without Instruction Tuning☆36Sep 24, 2024Updated last year
- ☆12Apr 25, 2025Updated last year
- The Code and Script of "David's Slingshot: A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis"☆34Jun 13, 2025Updated 10 months ago
- A research project exploring fine-tuning BERT-style models for text generation☆40Nov 30, 2025Updated 5 months ago
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning (EMNLP 2025)☆59Oct 10, 2025Updated 6 months ago
- Official repo of Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains.☆43Jun 6, 2025Updated 11 months ago
- Code for paper: Long cOntext aliGnment via efficient preference Optimization☆24Oct 10, 2025Updated 6 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models☆127May 7, 2024Updated 2 years ago
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- ☆15Apr 11, 2024Updated 2 years ago
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders☆18May 23, 2025Updated 11 months ago
- Official code and data repository of MathChat: MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Inte…☆22Jun 3, 2024Updated last year
- Learning to Skip the Middle Layers of Transformers☆17Aug 7, 2025Updated 9 months ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]☆32Jan 23, 2025Updated last year
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆21Feb 27, 2025Updated last year
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆118Sep 26, 2024Updated last year
- ACL24☆11Jun 7, 2024Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆62Aug 30, 2024Updated last year
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- The code and data for the paper JiuZhang3.0☆49May 26, 2024Updated last year
- ☆35Sep 14, 2024Updated last year
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning☆34Aug 9, 2023Updated 2 years ago
- ☆23Jul 5, 2024Updated last year
- LMM for VQA, tcsvt version☆10Jul 19, 2024Updated last year
- ☆34Apr 1, 2025Updated last year
- Dateset Reset Policy Optimization☆31Apr 12, 2024Updated 2 years ago
- ☆34Nov 18, 2025Updated 5 months ago
- ☆16Jul 17, 2025Updated 9 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…☆15Oct 31, 2025Updated 6 months ago
- [NeurIPS 2025 D&B (Spotlight🌟)] TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenario☆31Oct 5, 2025Updated 7 months ago
- ☆16Jul 10, 2023Updated 2 years ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards☆47Apr 15, 2025Updated last year
- ☆22Oct 22, 2024Updated last year
- AdaICL: Which Examples to Annotate of In-Context Learning? Towards Effective and Efficient Selection☆19Oct 30, 2023Updated 2 years ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆98Feb 21, 2025Updated last year