Papers related to Direct Preference Optimization (DPO)
☆19 · Jul 16, 2024 · Updated last year
Alternatives and similar repositories for awesome-DPO
Users who are interested in awesome-DPO are comparing it to the repositories listed below.
- GitHub repo for the Microsoft hackathon, Nov 2024 - https://microsoftfabric.devpost.com/?ref_content=default&ref_feature=challenge&ref_medium=… ☆11 · Dec 30, 2024 · Updated last year
- ☆13 · Jul 15, 2024 · Updated last year
- [ICML 2023] Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optim… ☆10 · Dec 19, 2023 · Updated 2 years ago
- A WeChat mini program built with Taro, covering authorization prompts for WeChat login, user info, address, phone number, saving images to the photo album, and so on. Main features: e-commerce purchase flow, map markers, etc. ☆12 · Sep 4, 2020 · Updated 5 years ago
- ☆10 · May 16, 2021 · Updated 4 years ago
- A foot-activated webcam to show off your sneakers ☆12 · Aug 4, 2021 · Updated 4 years ago
- A list of papers regarding generalization in (deep) reinforcement learning ☆11 · Aug 13, 2023 · Updated 2 years ago
- A PyTorch implementation of Pensieve (SIGCOMM'18) ☆12 · Jun 17, 2020 · Updated 5 years ago
- AI SaaS Companion with Next.js 13, React, Tailwind, Prisma, Stripe, PlanetScale, Upstash, Pinecone & Replicate API. ☆17 · Aug 30, 2024 · Updated last year
- "Multimodal Large Model Deployment and Fine-Tuning Guide": quickly deploy and fine-tune multimodal large models ☆12 · Dec 4, 2024 · Updated last year
- ☆15 · May 22, 2025 · Updated 9 months ago
- ☆13 · Sep 24, 2023 · Updated 2 years ago
- AutoSkill: Experience-Driven Lifelong Learning via Skill Self-Evolution ☆36 · Updated this week
- AttireAI - Conversational Fashion Outfit Generator ☆11 · Sep 7, 2024 · Updated last year
- The official implementation of Preference Data Reward-Augmentation. ☆18 · May 1, 2025 · Updated 10 months ago
- CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior ☆13 · Sep 21, 2022 · Updated 3 years ago
- ☆18 · Jun 30, 2023 · Updated 2 years ago
- Tracewright: a regression test automation agent for Playwright ☆32 · Mar 2, 2026 · Updated last week
- This project aims to build upon the existing MGTBench project, extending its functionalities with the option to import and evaluate the bench… ☆21 · Nov 5, 2024 · Updated last year
- ☆18 · Oct 8, 2024 · Updated last year
- Models, data, and code for the paper: MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models ☆24 · Sep 26, 2024 · Updated last year
- RLVR for LLMs in optimization modeling ☆45 · Dec 17, 2025 · Updated 2 months ago
- Course Website for ICS Spring 2020 at Fudan University https://sunfloweraries.github.io/ICS-Spring20-Fudan/ ☆12 · May 15, 2020 · Updated 5 years ago
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning". ☆20 · Feb 26, 2025 · Updated last year
- All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks ☆18 · Apr 24, 2024 · Updated last year
- The open-source materials for the paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 · Nov 12, 2024 · Updated last year
- Official code implementation of SKU, accepted to ACL 2024 Findings ☆20 · Dec 18, 2024 · Updated last year
- ☆20 · Jan 6, 2023 · Updated 3 years ago
- Codebase for the paper "Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs" ☆22 · Apr 24, 2025 · Updated 10 months ago
- StableLM hosted on Runpod.io Serverless GPUs ☆19 · Apr 14, 2024 · Updated last year
- ☆17 · Aug 1, 2025 · Updated 7 months ago
- This repository is the official implementation of TrafficGamer. ☆31 · Nov 22, 2024 · Updated last year
- PyTorch version of the multi-view harmonized bilinear network for 3D object recognition ☆23 · Dec 16, 2018 · Updated 7 years ago
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to … ☆62 · Jan 28, 2026 · Updated last month
- ☆36 · Jul 2, 2025 · Updated 8 months ago
- ☆25 · Nov 5, 2025 · Updated 4 months ago
- Multi-agent environments library for simulating classic vehicle routing problems. ☆30 · Feb 26, 2026 · Updated last week
- ☆21 · Aug 23, 2023 · Updated 2 years ago
- ☆23 · Feb 8, 2024 · Updated 2 years ago