Filtered Direct Preference Optimization (fDPO) improves language model alignment with human preferences by discarding preference samples whose quality is lower than that of responses generated by the model being trained
☆16 · Nov 27, 2024 · Updated last year
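The filtering idea in the description can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes quality is scored by a reward model and that only the chosen response of each pair is compared against the current model's own response. The names `filter_preference_data`, `generate`, and `reward`, and the `prompt`/`chosen`/`rejected` record layout, are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical record layout: one preference pair per prompt.
Sample = Dict[str, str]  # keys: "prompt", "chosen", "rejected"


def filter_preference_data(
    dataset: List[Sample],
    generate: Callable[[str], str],
    reward: Callable[[str, str], float],
) -> List[Sample]:
    """Keep pairs whose chosen response scores at least as high as the
    current model's own response to the same prompt (assumed criterion)."""
    kept = []
    for sample in dataset:
        prompt = sample["prompt"]
        # Response sampled from the model being trained (stand-in callable).
        model_response = generate(prompt)
        # Discard the pair if the model already produces something better.
        if reward(prompt, sample["chosen"]) >= reward(prompt, model_response):
            kept.append(sample)
    return kept
```

In practice `generate` would wrap sampling from the policy and `reward` would wrap a trained reward model; for a quick check, any toy callables with the same signatures will do, for example a reward that returns the response length.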
Alternatives and similar repositories for filtered-dpo
Users who are interested in filtered-dpo are comparing it to the libraries listed below
- ☆12 · Jan 2, 2024 · Updated 2 years ago
- Using conversational games to evaluate powerful LLMs · ☆18 · Sep 3, 2023 · Updated 2 years ago
- Self-Supervised Alignment with Mutual Information · ☆20 · May 24, 2024 · Updated last year
- ☆22 · Aug 10, 2022 · Updated 3 years ago
- ☆31 · Mar 23, 2024 · Updated last year
- [EMNLP 2023] ALCUNA: Large Language Models Meet New Knowledge · ☆29 · Oct 30, 2023 · Updated 2 years ago
- forked from DongZhouGu/arxiv-daily · ☆22 · Nov 8, 2022 · Updated 3 years ago
- Big Data Analysis of Tinder done at Universitat Rovira i Virgili and Universitat Politècnica de Catalunya · BarcelonaTech · ☆13 · Jan 3, 2023 · Updated 3 years ago
- Understanding the correlation between different LLM benchmarks · ☆29 · Jan 11, 2024 · Updated 2 years ago
- A tool to paste Excel ranges to Reddit · ☆11 · Sep 20, 2025 · Updated 5 months ago
- [ICLR 2025] Code for the paper "Implicit Search via Discrete Diffusion: A Study on Chess" · ☆37 · Mar 3, 2025 · Updated last year
- ☆31 · Dec 19, 2023 · Updated 2 years ago
- The first OpenSource Mafia Bot! · ☆10 · Oct 5, 2023 · Updated 2 years ago
- Multiprocessing in python · ☆10 · Aug 20, 2021 · Updated 4 years ago
- [ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (As Huggingface Daily Papers: … · ☆90 · Nov 23, 2025 · Updated 3 months ago
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators · ☆54 · May 15, 2025 · Updated 9 months ago
- Official Implementation of "Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts" at EMNLP 202… · ☆13 · Oct 27, 2024 · Updated last year
- DNH Werewolf Discord bot · ☆13 · Dec 19, 2024 · Updated last year
- A record of useful Git repos · ☆12 · Jul 28, 2024 · Updated last year
- Open-source Human Feedback Library · ☆11 · Oct 25, 2023 · Updated 2 years ago
- YouTube Assistant · ☆12 · May 15, 2023 · Updated 2 years ago
- Alfred 🎩 plugin for 小鸡词典 🐤 (cluck cluck cluck) · ☆11 · Apr 19, 2023 · Updated 2 years ago
- Copilot-Python learning materials from teacher Li Lulu (李鲁鲁). Co-evolving with large language models such as ChatGPT. · ☆10 · Jun 3, 2025 · Updated 9 months ago
- Code for Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model · ☆13 · Feb 15, 2024 · Updated 2 years ago
- MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols · ☆17 · Nov 19, 2025 · Updated 3 months ago
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues · ☆13 · Feb 7, 2024 · Updated 2 years ago
- Inspirational post ids collected from Reddit using pushift.io and RoBERTa · ☆10 · Jan 18, 2024 · Updated 2 years ago
- Documentation at · ☆14 · Mar 27, 2025 · Updated 11 months ago
- Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs" · ☆10 · Dec 13, 2024 · Updated last year
- This is code for the EMNLP 2022 Paper "UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation". · ☆10 · Apr 30, 2023 · Updated 2 years ago
- ☆10 · May 19, 2024 · Updated last year
- ☆38 · Oct 2, 2024 · Updated last year
- Baidu QA dataset with 1 million entries · ☆45 · Nov 30, 2023 · Updated 2 years ago
- A tool for converting FERC filings published in XBRL into SQLite databases · ☆15 · Feb 24, 2026 · Updated last week
- PyTorch implementation for PaLM: A Hybrid Parser and Language Model. · ☆10 · Jan 7, 2020 · Updated 6 years ago
- This is a fork of optimization part of RISO project (http://riso.sourceforge.net/) · ☆13 · Aug 30, 2015 · Updated 10 years ago
- ☆12 · Oct 5, 2022 · Updated 3 years ago
- Vapoursynth Python scripts · ☆11 · Feb 7, 2026 · Updated 3 weeks ago
- ☆10 · Oct 6, 2021 · Updated 4 years ago