architsharma97 / dpo-rlaifView external linksLinks
☆99Jun 27, 2024Updated last year
Alternatives and similar repositories for dpo-rlaif
Users that are interested in dpo-rlaif are comparing it to the libraries listed below
Sorting:
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models"☆37Jan 3, 2024Updated 2 years ago
- [AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues☆60May 2, 2025Updated 9 months ago
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"☆33Nov 29, 2023Updated 2 years ago
- A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)☆193Aug 6, 2025Updated 6 months ago
- Code and data for CoachLM, an automatic instruction revision approach LLM instruction tuning.☆60Mar 20, 2024Updated last year
- [Pattern Recognition 2024] Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models, Dong Li, Jiandon…☆18Jan 18, 2025Updated last year
- The official implementation for Collaborative Word-based Pre-trained Item Representation for Transferable Recommendation.☆25Jan 30, 2024Updated 2 years ago
- ☆18Feb 20, 2024Updated last year
- The official implementation of Self-Play Fine-Tuning (SPIN)☆1,234May 8, 2024Updated last year
- The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning”☆17Feb 26, 2024Updated last year
- Official repo for EMNLP 2023 paper "Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations…☆29Dec 5, 2023Updated 2 years ago
- ☆29Jan 23, 2024Updated 2 years ago
- ☆29Dec 28, 2025Updated last month
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou…☆32Apr 20, 2024Updated last year
- ☆37Jan 20, 2024Updated 2 years ago
- Code for paper titled "Towards the Law of Capacity Gap in Distilling Language Models"☆102Jul 9, 2024Updated last year
- ☆34Sep 14, 2024Updated last year
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps"☆25Jul 12, 2024Updated last year
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).☆82Mar 18, 2024Updated last year
- 2D road segmentation using lidar data during training☆43Dec 21, 2023Updated 2 years ago
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.☆85Mar 7, 2025Updated 11 months ago
- ☆144Sep 10, 2023Updated 2 years ago
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings.☆92Jul 21, 2024Updated last year
- This repository is the project page for "Point Anywhere: Directed Object Estimation from Omnidirectional Images", including source code …☆12Aug 25, 2023Updated 2 years ago
- This repository contains a framework for converting monocular videos into side-by-side (SBS) 3D videos. It utilizes a combination of imag…☆90Feb 11, 2024Updated 2 years ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆23Mar 12, 2024Updated last year
- ☆25May 16, 2024Updated last year
- Official repository for ORPO☆471May 31, 2024Updated last year
- SIGIR 2021: Proactive Retrieval-based Chatbots based on Relevant Knowledge and Goals☆11Jul 30, 2021Updated 4 years ago
- MLFlow End to End Workshop at Chandigarh University☆11Feb 3, 2023Updated 3 years ago
- A simple python package to stretch audio files and change their speed☆12Jan 16, 2026Updated last month
- Code for Findings of ACL 2021 paper "Addressing Inquiries about History: An Efficient and Practical Framework for Evaluating Open-domain …☆19Dec 16, 2022Updated 3 years ago
- The official repository for the CodeGym project: "Generalizable End-to-End Tool-Use RL with Synthetic CodeGym"☆22Oct 14, 2025Updated 4 months ago
- Active Learning Helps Pretrained Models Learn the Intended Task (https://arxiv.org/abs/2204.08491) by Alex Tamkin, Dat Nguyen, Salil Desh…☆11Nov 22, 2022Updated 3 years ago
- ☆43Sep 10, 2025Updated 5 months ago
- Local LLM-based social network filter☆70Jan 31, 2024Updated 2 years ago
- ☆282Jan 6, 2025Updated last year
- Official implementation of MetaTree: Learning a Decision Tree Algorithm with Transformers☆114Sep 13, 2024Updated last year
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"☆48Jan 17, 2024Updated 2 years ago