CyberAgentAILab/regularized-bon

Code for "Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment" (2025).

☆14

Alternatives and similar repositories for regularized-bon

Users who are interested in regularized-bon are comparing it to the libraries listed below.

CyberAgentAILab / filtered-dpo
[EMNLP 2024] Introducing Filtered Direct Preference Optimization (fDPO) that enhances language model alignment with human preferences by …
☆16 · Nov 27, 2024 · Updated last year
RUCAIBox / CARP
☆17 · Jun 14, 2023 · Updated 3 years ago
ssokota / mmd
Code for magnetic mirror descent.
☆20 · Oct 5, 2023 · Updated 2 years ago
gl-ybnbxb / BoNBoN
☆19 · Jun 3, 2024 · Updated 2 years ago
KEAML-JLU / SimSTC
The source code for "A Simple Graph Contrastive Learning Framework for Short Text Classification"
☆13 · Aug 14, 2025 · Updated 11 months ago
Language-Media-Lab / commonsense-moral-ja
☆15 · Nov 20, 2025 · Updated 8 months ago
CatherineMeng / FGYM-user-demo
Demonstrating the usage of FGYM: A Toolkit for benchmarking FPGA-accelerated Reinforcement Learning
☆14 · Aug 12, 2021 · Updated 4 years ago
vaaaaanquish / docker-UTH-BERT
docker for UTH-BERT: https://ai-health.m.u-tokyo.ac.jp/uth-bert
☆14 · Mar 24, 2023 · Updated 3 years ago
kristychoi / pixel_exploration
PyTorch implementation of Count-Based Exploration with Neural Density Models
☆10 · Mar 22, 2018 · Updated 8 years ago
hoffa / year-on-a-page
The entire year on a single page
☆12 · Dec 5, 2025 · Updated 7 months ago
purbeshmitra / MOTIF
MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs
☆17 · Jul 6, 2025 · Updated last year
pfnet-research / jfbench
☆15 · Mar 12, 2026 · Updated 4 months ago
Likon69 / CopilotBuddy
Public WotLK 3.3.5a bot in C#/WPF. API surface ported from Honorbuddy, retargeted at build 12340 and custom servers. │ Botbases, navig…
☆15 · Jul 13, 2026 · Updated last week
tianyi-lab / R2-T2
[ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"
☆19 · Mar 10, 2025 · Updated last year
yxdydgithub / difftalk_preprocess
☆13 · May 11, 2024 · Updated 2 years ago
shankarp8 / knowledge_distillation
Repository for "Propagating Knowledge Updates to LMs Through Distillation" (NeurIPS 2023).
☆27 · Aug 25, 2024 · Updated last year
mbchang / decentralized-rl
Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)
☆43 · Dec 8, 2022 · Updated 3 years ago
SerenaTetart / MultiboxBot
MultiboxBot is a bot for multiboxing on WoW with up to 40 accounts using DLL injection, hooking and sockets.
☆18 · Jul 8, 2026 · Updated 2 weeks ago
ebonyfaye / ema
☆10 · Feb 12, 2026 · Updated 5 months ago
gabry1998 / Self-Supervised-Anomaly-Detection
Thesis project about Visual Anomaly Detection based on Self Supervised Learning. The model identifies anomalies from information acquired…
☆10 · Apr 14, 2023 · Updated 3 years ago
CSSLab / ThinkTwice
Jointly Optimizing Large Language Models for Reasoning and Self-Refinement
☆15 · Apr 22, 2026 · Updated 3 months ago
Hritikbansal / jpo
☆13 · Jul 2, 2025 · Updated last year
Lagooon / LeanSTaR
☆44 · Sep 19, 2024 · Updated last year
registor / nwafuCoursePaper
A LaTeX template for typesetting short course papers.
☆10 · Oct 21, 2019 · Updated 6 years ago
fastlabel / fastlabel-python-sdk
The official Python SDK for FastLabel API, the Data Platform for AI
☆16 · Updated this week
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆62 · Aug 30, 2024 · Updated last year
Kwai-Klear / CE-GPPO
CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning
☆16 · Jan 23, 2026 · Updated 5 months ago
pfnet-research / label-efficient-brain-tumor-segmentation
☆21 · Sep 24, 2020 · Updated 5 years ago
SunDoge / bytepiece-rs
A purer, higher-compression-rate tokenizer in Rust
☆14 · Dec 21, 2024 · Updated last year
portal-cornell / muCode
☆33 · Oct 2, 2025 · Updated 9 months ago
liumy2010 / LiteEFG
Python library for solving Extensive-form Games and implementation of various baseline algorithms (e.g., Counterfactual Regret Minimizati…
☆37 · Feb 12, 2026 · Updated 5 months ago
spinute / go-by-example
Source code for the website "サンプルで学ぶ Go 言語" (the Japanese edition of Go by Example)
☆17 · Aug 17, 2024 · Updated last year
apartresearch / mechanisticinterpretability
A repository for awesome resources in mechanistic interpretability
☆16 · Jan 18, 2023 · Updated 3 years ago
seanie12 / ThinkSafe
☆19 · May 4, 2026 · Updated 2 months ago
brownhci / SleepCoacher
SleepCoacher recommendation engine, reference implementation for paper at http://jeffhuang.com/Final_SleepCoacher_UIST16.pdf
☆26 · Oct 6, 2016 · Updated 9 years ago
hackersground-kr / hackers-ground
Hackers Ground Hackathon 2024
☆12 · Aug 26, 2024 · Updated last year
aahmed-se / python-vector-search-tutorial-gpt4
Python Vector Search tutorial generated using gpt4
☆12 · Mar 18, 2023 · Updated 3 years ago
eric-mitchell / concord
☆14 · Nov 15, 2022 · Updated 3 years ago
Smu-Tan / Remedy
[EMNLP2025] Remedy: Learning Machine Translation Evaluation from Human Preferences with Reward Modeling
☆16 · Nov 20, 2025 · Updated 8 months ago