ZHZisZZ/weak-to-strong-search

[NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models

☆67

Alternatives and similar repositories for weak-to-strong-search

Users who are interested in weak-to-strong-search are comparing it to the repositories listed below.

ZHZisZZ / emulated-disalignment
[ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
☆39 · Aug 2, 2024 · Updated last year
jonnypei / acl23-preadd
☆12 · Jul 25, 2023 · Updated 2 years ago
Zanette-Labs / SpeculativeRejection
[NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection
☆56 · Oct 29, 2024 · Updated last year
ZHZisZZ / modpo
[ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
☆101 · Aug 20, 2024 · Updated last year
FreedomIntelligence / OVM
☆74 · Apr 2, 2024 · Updated 2 years ago
Yuancheng-Xu / GenARM
Code for ICLR 2025 Paper "GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment"
☆24 · Feb 10, 2025 · Updated last year
EIT-NLP / SkipGPT
[ICML 2025] Official implementation of the paper "SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling". …
☆21 · Nov 17, 2025 · Updated 8 months ago
JamyDon / PLM-based-CGEC-Model-Ensemble
[ACL 2023] Are Pre-trained Language Models Useful for Model Ensemble in Chinese Grammatical Error Correction?
☆10 · Dec 15, 2025 · Updated 7 months ago
Will-Nie / AutoLinePlotter
This repo supports automatic line plots for multi-seed event files from TensorBoard
☆12 · Jun 23, 2022 · Updated 4 years ago
aryamanarora / bayesian-laws-icl
Bayesian scaling laws for in-context learning.
☆16 · Mar 12, 2025 · Updated last year
starrYYxuan / LeCo
This is the implementation of LeCo
☆33 · Jan 20, 2025 · Updated last year
alisawuffles / proxy-tuning
Code associated with Tuning Language Models by Proxy (Liu et al., 2024)
☆134 · Mar 30, 2024 · Updated 2 years ago
alexrame / rewardedsoups
Rewarded soups official implementation
☆64 · Sep 27, 2023 · Updated 2 years ago
Callione / LLaVA-MOSS2
A modified LLaVA framework for MOSS2 that makes MOSS2 a multimodal model.
☆13 · Sep 19, 2024 · Updated last year
okarthikb / DPO
Implementation of Direct Preference Optimization
☆17 · Jul 17, 2023 · Updated 3 years ago
tmlr-group / NoisyRationales
[NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"
☆40 · Jul 18, 2025 · Updated last year
niconi19 / Emergent-Response-Planning-in-LLMs
[ICML 2025] Emergent Response Planning in LLMs
☆20 · Jul 1, 2025 · Updated last year
MANGA-UOFA / PTfer
☆11 · Nov 13, 2024 · Updated last year
jxzhangjhu / awesome-LLM-controlled-decoding-generation
awesome-LLM-controlled-constrained-generation
☆57 · Aug 16, 2024 · Updated last year
Leooyii / LCEG
[COLM'25] A Controlled Study on Long Context Extension and Generalization in LLMs
☆65 · Mar 9, 2026 · Updated 4 months ago
WeiminXiong / IPR
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)
☆68 · Oct 18, 2024 · Updated last year
zwhong714 / weak-to-strong-preference-optimization
[ICLR 2025 Spotlight] Weak-to-strong preference optimization: stealing reward from weak aligned model
☆18 · Feb 24, 2025 · Updated last year
Lingkai-Kong / RE-Control
Code for paper: Aligning Large Language Models with Representation Editing: A Control Perspective
☆35 · Jan 31, 2025 · Updated last year
MANGA-UOFA / fdistill
☆22 · Feb 4, 2026 · Updated 5 months ago
Nicolas-BZRD / llm-distillation
☆11 · Feb 3, 2025 · Updated last year
PKU-Alignment / aligner
[NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct
☆193 · Jan 16, 2025 · Updated last year
mjy1111 / PEAK
The repository for our paper: Neighboring Perturbations of Knowledge Editing on Large Language Models
☆16 · May 4, 2024 · Updated 2 years ago
SparkJiao / dpo-trajectory-reasoning
[EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".
☆84 · Jan 14, 2025 · Updated last year
yifeiwang77 / Self-Correction
☆20 · Nov 3, 2024 · Updated last year
liutianlin0121 / decoding-time-realignment
Implementation of "Decoding-time Realignment of Language Models", ICML 2024.
☆21 · Jun 17, 2024 · Updated 2 years ago
whyNLP / Conic10K
Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP 2023 Findings.
☆33 · Dec 6, 2023 · Updated 2 years ago
AI4fun / DQ-LoRe
☆13 · Jun 26, 2024 · Updated 2 years ago
F2-Song / Weak-to-Strong-Decoding
The official implementation of "Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding"
☆22 · Jun 26, 2025 · Updated last year
sail-sg / CPO
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
☆137 · Mar 21, 2025 · Updated last year
Pranjal2041 / AdaptiveConsistency
Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs
☆41 · Jan 30, 2024 · Updated 2 years ago
shizhediao / Black-Box-Prompt-Learning
Source code for the TMLR paper "Black-Box Prompt Learning for Pre-trained Language Models"
☆59 · Sep 7, 2023 · Updated 2 years ago
Jacob-Zhou / gecdi
The repo of "Improving Seq2Seq Grammatical Error Correction via Decoding Interventions"
☆32 · Jan 22, 2024 · Updated 2 years ago
zju-vipa / training_free_model_merging
This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR 2024).
☆34 · Mar 5, 2024 · Updated 2 years ago
HanNight / AdaCAD
Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"
☆16 · Mar 2, 2026 · Updated 4 months ago