CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025)
☆73Jun 25, 2024Updated last year
Alternatives and similar repositories for CodeUltraFeedback
Users that are interested in CodeUltraFeedback are comparing it to the libraries listed below
Sorting:
- Self-Supervised Alignment with Mutual Information☆20May 24, 2024Updated last year
- Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning"☆11Jan 10, 2025Updated last year
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback☆74Aug 31, 2024Updated last year
- (NeurIPS '22) LISA: Learning Interpretable Skill Abstractions - A framework for unsupervised skill learning using Imitation☆29Feb 22, 2023Updated 3 years ago
- ☆160Nov 23, 2024Updated last year
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation☆49Dec 22, 2023Updated 2 years ago
- ☆16Jul 23, 2024Updated last year
- Training and Benchmarking LLMs for Code Preference.☆38Nov 15, 2024Updated last year
- Code to accompany the paper "The Information Geometry of Unsupervised Reinforcement Learning"☆20Oct 6, 2021Updated 4 years ago
- Self-Alignment with Principle-Following Reward Models☆169Sep 18, 2025Updated 5 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆40Jun 10, 2024Updated last year
- A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza…☆20Nov 21, 2024Updated last year
- ☆20Nov 4, 2025Updated 3 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆39Jan 12, 2024Updated 2 years ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments"☆58Feb 29, 2024Updated 2 years ago
- Linear Attention Sequence Parallelism (LASP)☆88Jun 4, 2024Updated last year
- ☆342Jun 5, 2025Updated 8 months ago
- Lightweight Adapting for Black-Box Large Language Models☆25Feb 15, 2024Updated 2 years ago
- Utilities for efficient fine-tuning, inference and evaluation of code generation models☆21Oct 3, 2023Updated 2 years ago
- Directional Preference Alignment☆58Sep 23, 2024Updated last year
- Collection of papers for scalable automated alignment.☆93Oct 22, 2024Updated last year
- ☆30Jun 19, 2023Updated 2 years ago
- ☆44Jun 2, 2024Updated last year
- Neuron Activation☆26Nov 21, 2024Updated last year
- Accepted by Transactions on Machine Learning Research (TMLR)☆137Oct 5, 2024Updated last year
- 3rd placed submission to the NeurIPS MineRL competition 2019☆10Mar 24, 2023Updated 2 years ago
- Open-source repository for the OOPSLA'24 paper "CYCLE: Learning to Self-Refine Code Generation"☆10Mar 8, 2024Updated last year
- ☆11Mar 13, 2023Updated 2 years ago
- Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022☆11Aug 20, 2022Updated 3 years ago
- Source code for NeurIPS 2020 paper "Node Classification on Graphs with Few-Shot Novel Labels via Meta Transformed Network Embedding"☆10Nov 17, 2020Updated 5 years ago
- ☆11Jan 3, 2024Updated 2 years ago
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Mar 22, 2024Updated last year
- Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)☆255Jul 16, 2024Updated last year
- Fun project to run your own LLM chat bot using llama.cpp☆11Jun 9, 2023Updated 2 years ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers☆57Mar 6, 2025Updated 11 months ago
- ☆12Jul 8, 2023Updated 2 years ago
- Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning☆11Jun 16, 2022Updated 3 years ago
- ☆12Mar 23, 2025Updated 11 months ago
- ☆52Feb 12, 2025Updated last year