microsoft / ConstrainedReasoner

☆11

Alternatives and similar repositories for ConstrainedReasoner

Users who are interested in ConstrainedReasoner are comparing it to the libraries listed below.

JayZhang42 / SLED
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models https://arxiv.org/pdf/2411.02433
☆26 Updated 6 months ago
MurongYue / LLM_MoT_cascade
This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIXTURE OF THOUGHT REPRESENTATIONS FOR COST-EFFICIENT REA…
☆23 Updated last year
GAIR-NLP / scaleeval
Scalable Meta-Evaluation of LLMs as Evaluators
☆42 Updated last year
cambridgeltl / PairS
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)
☆47 Updated 5 months ago
locuslab / scaling_laws_data_filtering
☆64 Updated last year
Dereck0602 / Awesome_Test_Time_LLMs
☆109 Updated 3 months ago
tianyang-x / SaySelf
Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales"
☆106 Updated 9 months ago
john-hewitt / implicit-ins
Codebase for Instruction Following without Instruction Tuning
☆34 Updated 9 months ago
tatsu-lab / test_set_contamination
☆38 Updated last year
Reason-Wang / NAT
[NAACL 2025] The official implementation of the paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…
☆26 Updated last year
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆42 Updated 8 months ago
r-three / RAD
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆43 Updated last year
DynaMath / DynaMath
A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
☆24 Updated 7 months ago
DAMO-NLP-SG / LongPO
[ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
☆37 Updated 4 months ago
ernie-research / Tool-Augmented-Reward-Model
[ICLR'24 spotlight] Tool-Augmented Reward Modeling
☆50 Updated 3 weeks ago
facebookresearch / RLCD
Reproduction of "RLCD Reinforcement Learning from Contrast Distillation for Language Model Alignment"
☆69 Updated last year
RUCAIBox / RLMEC
The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"
☆38 Updated last year
sail-sg / Cheating-LLM-Benchmarks
[ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral)
☆79 Updated 8 months ago
abertsch72 / long-context-icl
Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration"
☆37 Updated 10 months ago
chenyiqun / MMOA-RAG
This is the code of MMOA-RAG.
☆53 Updated last month
yale-nlp / refdpo
☆16 Updated 11 months ago
jwhj / OREO
☆114 Updated 5 months ago
zjunlp / unlearn
[ACL 2025] Knowledge Unlearning for Large Language Models
☆37 Updated last month
QingruZhang / PASTA
PASTA: Post-hoc Attention Steering for LLMs
☆120 Updated 7 months ago
allenai / super-benchmark
☆45 Updated 2 months ago
TianduoWang / DPO-ST
[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
☆45 Updated 11 months ago
RLHFlow / Directional-Preference-Alignment
Directional Preference Alignment
☆57 Updated 9 months ago
THU-KEG / AdaptThink
☆116 Updated last month
WindyLee0822 / Process_Q_Model
Official implementation of the paper "Process Reward Model with Q-value Rankings"
☆59 Updated 4 months ago
kamanphoebe / Look-into-MoEs
[NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models
☆52 Updated 4 months ago