Dahoas / QDSyntheticData

☆14

Alternatives and similar repositories for QDSyntheticData

Users who are interested in QDSyntheticData are comparing it to the repositories listed below.

casmlab / NPHardEval
Repository for NPHardEval, a quantified-dynamic benchmark of LLMs
☆56 Updated last year
martin-wey / CodeUltraFeedback
CodeUltraFeedback: aligning large language models to coding preferences
☆71 Updated last year
ctlllll / reward_collapse
☆27 Updated 2 years ago
upiterbarg / lintseq
[ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus)
☆19 Updated 5 months ago
samuelarnesen / nyu-debate-modeling
☆22 Updated 9 months ago
cassidylaidlaw / orpo
☆18 Updated 8 months ago
allenai / easy-to-hard-generalization
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48 Updated last year
GSYfate / knnlm-limits
Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs"
☆23 Updated 2 months ago
hughbzhang / o1_inference_scaling_laws
Replicating O1 inference-time scaling laws
☆89 Updated 7 months ago
Asap7772 / understanding-rlhf
Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou…
☆29 Updated last year
formll / resolving-scaling-law-discrepancies
☆20 Updated last year
r-three / RAD
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆43 Updated last year
ethz-spylab / superhuman-ai-consistency
☆29 Updated 2 years ago
LoryPack / LLM-LieDetector
Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"
☆71 Updated last year
ucl-dark / llm_debate
Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"
☆112 Updated last year
tml-epfl / icl-alignment
Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]
☆31 Updated 5 months ago
gregorbachmann / Next-Token-Failures
☆87 Updated last year
zhaoxlpku / SubgoalXL
☆25 Updated 10 months ago
crux-eval / eval-arena
☆28 Updated last week
shunzh / Code-AI-Tree-Search
☆119 Updated last year
janphilippfranken / sami
Self-Supervised Alignment with Mutual Information
☆20 Updated last year
likenneth / q_probe
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
☆41 Updated last year
Edward-Sun / easy-to-hard
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
☆123 Updated 10 months ago
zhangir-azerbayev / MetaMath
☆11 Updated last year
yidingjiang / ado
The repository contains code for Adaptive Data Optimization
☆25 Updated 7 months ago
JacobPfau / fillerTokens
☆66 Updated last year
architsharma97 / dpo-rlaif
☆99 Updated last year
StigLidu / TURN
[ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"
☆21 Updated 5 months ago
hamishivi / automated-instruction-selection
Exploration of automated dataset selection approaches at large scales.
☆46 Updated 4 months ago
lingo-mit / lm-truthfulness
☆17 Updated last year