CyberAgentAILab / filtered-dpo
Introducing Filtered Direct Preference Optimization (fDPO), which enhances language model alignment with human preferences by discarding training samples of lower quality than those generated by the model being trained
☆16Nov 27, 2024Updated last year
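The filtering idea in the description above can be illustrated with a minimal sketch. It is not the repository's implementation: the names `filter_preferences`, `generate`, and `reward` are hypothetical, and the sketch assumes that "quality" is measured by some scoring function and that a sample is kept only when its preferred response scores at least as high as a response generated by the model being trained.

```python
from typing import Callable, Dict, List

# One preference sample; the key names are an assumption made for this sketch.
Sample = Dict[str, str]  # keys: "prompt", "chosen", "rejected"


def filter_preferences(
    dataset: List[Sample],
    generate: Callable[[str], str],        # hypothetical: the learning model's sampler
    reward: Callable[[str, str], float],   # hypothetical: scores (prompt, response)
) -> List[Sample]:
    """Keep samples whose chosen response scores at least as high as the model's own."""
    kept = []
    for sample in dataset:
        prompt = sample["prompt"]
        candidate = generate(prompt)
        if reward(prompt, sample["chosen"]) >= reward(prompt, candidate):
            kept.append(sample)
    return kept


# Toy stand-ins so the sketch runs on its own: the "model" always answers
# "abcd" and the "reward" is simply the response length.
toy_dataset = [
    {"prompt": "q1", "chosen": "abcdef", "rejected": "a"},
    {"prompt": "q2", "chosen": "ab", "rejected": "a"},
]
kept = filter_preferences(
    toy_dataset,
    generate=lambda prompt: "abcd",
    reward=lambda prompt, response: float(len(response)),
)
```

In a real setup the filtered set would then be passed to an ordinary DPO training step, and the filter would presumably be re-run as the model improves; both points are assumptions rather than details confirmed by this page.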
Alternatives and similar repositories for filtered-dpo
Users that are interested in filtered-dpo are comparing it to the libraries listed below
- Code of "Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment" (2025).☆14Apr 4, 2025Updated 10 months ago
- ☆12Jan 2, 2024Updated 2 years ago
- Using conversational games to evaluate powerful LLMs☆18Sep 3, 2023Updated 2 years ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization☆96Aug 20, 2024Updated last year
- Self-Supervised Alignment with Mutual Information☆20May 24, 2024Updated last year
- ☆22Aug 10, 2022Updated 3 years ago
- ☆31Mar 23, 2024Updated last year
- [EMNLP 2023] ALCUNA: Large Language Models Meet New Knowledge☆29Oct 30, 2023Updated 2 years ago
- forked from DongZhouGu/arxiv-daily☆22Nov 8, 2022Updated 3 years ago
- Big Data Analysis of Tinder done at Universitat Rovira i Virgili and Universitat Politècnica de Catalunya · BarcelonaTech☆13Jan 3, 2023Updated 3 years ago
- [ICLR 2025] Code for the paper "Implicit Search via Discrete Diffusion: A Study on Chess"☆35Mar 3, 2025Updated 11 months ago
- ☆31Dec 19, 2023Updated 2 years ago
- Multiprocessing in python☆10Aug 20, 2021Updated 4 years ago
- The first OpenSource Mafia Bot!☆10Oct 5, 2023Updated 2 years ago
- Comparative Study and Implementation of Five Factor Model and Myers-Briggs Type Indicator Model☆11Sep 28, 2023Updated 2 years ago
- Our data munging code.☆34Oct 13, 2025Updated 4 months ago
- [ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (As Huggingface Daily Papers: …☆90Nov 23, 2025Updated 2 months ago
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators☆54May 15, 2025Updated 8 months ago
- DNH Werewolf Discord bot☆13Dec 19, 2024Updated last year
- Documentation at☆14Mar 27, 2025Updated 10 months ago
- A record of useful Git repos☆12Jul 28, 2024Updated last year
- Dataset and codes for SEntFiN☆10May 31, 2023Updated 2 years ago
- Code for Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model☆13Feb 15, 2024Updated last year
- Open-source Human Feedback Library☆11Oct 25, 2023Updated 2 years ago
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues☆13Feb 7, 2024Updated 2 years ago
- ☆10May 19, 2024Updated last year
- YouTube Assistant☆12May 15, 2023Updated 2 years ago
- Alfred🎩 plugin for 小鸡词典🐤 (cluck cluck cluck)☆11Apr 19, 2023Updated 2 years ago
- Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"☆10Dec 13, 2024Updated last year
- Inspirational post ids collected from Reddit using pushift.io and RoBERTa☆10Jan 18, 2024Updated 2 years ago
- An implementation of MSSRM method☆11Mar 23, 2023Updated 2 years ago
- Baidu QA dataset with 1 million entries☆45Nov 30, 2023Updated 2 years ago
- Service for Bert model to Vector. An efficient text-to-vector service supporting multi-GPU, multi-worker, and multi-client calls, ready to use out of the box.☆12May 24, 2022Updated 3 years ago
- Asynchronous HTTP and WebSocket Server Library for (ESP32 + LwIP W5500). Now supporting using CString to save heap to send very large dat…☆13Dec 24, 2022Updated 3 years ago
- ☆12Oct 14, 2024Updated last year
- FamilyTool benchmark☆12Sep 10, 2025Updated 5 months ago
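- Experiments related to LLMs for Vietnamese (inspired by the Physics of LLMs series)☆11Oct 21, 2024Updated last year
- Practice exercise: a Python implementation of the ALS collaborative filtering model for product recommendation and user audience expansion☆10Jun 4, 2020Updated 5 years ago
- Draw flowcharts using natural language, based on OpenAI☆12Nov 13, 2023Updated 2 years ago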