martin-wey / CodeUltraFeedback

CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025)

☆71

Alternatives and similar repositories for CodeUltraFeedback

Users who are interested in CodeUltraFeedback are comparing it to the repositories listed below.

allenai / easy-to-hard-generalization
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48 Updated last year
casmlab / NPHardEval
Repository for NPHardEval, a quantified-dynamic benchmark of LLMs
☆59 Updated last year
GAIR-NLP / scaleeval
Scalable Meta-Evaluation of LLMs as Evaluators
☆42 Updated last year
architsharma97 / dpo-rlaif
☆100 Updated last year
bigcode-project / astraios
Astraios: Parameter-Efficient Instruction Tuning Code Language Models
☆62 Updated last year
hkust-nlp / llm-compression-intelligence
Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]
☆142 Updated last year
WHGTyen / BIG-Bench-Mistake
A dataset of LLM-generated chain-of-thought steps annotated with mistake location.
☆82 Updated last year
GAIR-NLP / OlympicArena
[NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
☆105 Updated 7 months ago
Ablustrund / APPS_Plus
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
☆71 Updated last year
IBM / SALMON
Self-Alignment with Principle-Following Reward Models
☆168 Updated last month
john-hewitt / implicit-ins
Codebase for Instruction Following without Instruction Tuning
☆36 Updated last year
rmshin / llm-mcts
☆41 Updated last year
Asap7772 / understanding-rlhf
Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou…
☆32 Updated last year
YuxiXie / SelfEval-Guided-Decoding
☆103 Updated last year
austrian-code-wizard / c3po
☆29 Updated 2 months ago
yidingjiang / ado
The repository contains code for Adaptive Data Optimization
☆26 Updated 10 months ago
GAIR-NLP / AIME-Preview
☆73 Updated 7 months ago
vwxyzjn / summarize_from_feedback_details
☆152 Updated 11 months ago
hkust-nlp / B-STaR
B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
☆85 Updated 5 months ago
hamishivi / EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆75 Updated last year
THUDM / T1
RL Scaling and Test-Time Scaling (ICML'25)
☆111 Updated 9 months ago
hamishivi / automated-instruction-selection
Exploration of automated dataset selection approaches at large scales.
☆47 Updated 7 months ago
Re-Align / just-eval
A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.
☆87 Updated last year
likenneth / dialogue_action_token
Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
☆28 Updated last year
locuslab / scaling_laws_data_filtering
☆65 Updated last year
Leooyii / LCEG
Long Context Extension and Generalization in LLMs
☆62 Updated last year
google-deepmind / bbeh
☆99 Updated 5 months ago
wwxu21 / CUT
Source code of "Reasons to Reject? Aligning Language Models with Judgments"
☆58 Updated last year
Gen-Verse / CURE
[NeurIPS 2025 Spotlight] ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning
☆126 Updated last month
nyu-mll / ILF-for-code-generation
☆80 Updated 7 months ago