ncsoft/offsetbias


ncsoft / offsetbias

Official implementation of "OffsetBias: Leveraging Debiased Data for Tuning Evaluators"

☆26

Alternatives and similar repositories for offsetbias

Users who are interested in offsetbias are comparing it to the libraries listed below.


joeljang / FLM
All-in-one repository for Fine-tuning & Pretraining (Large) Language Models
☆15 • Mar 8, 2023 • Updated 3 years ago
ncsoft / ncresearch
NC NLP Techblog. Introducing the challenges and changes that NC's NLP will bring.
☆22 • Jan 22, 2025 • Updated last year
kaistAI / Janus
[NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages
☆53 • Aug 10, 2025 • Updated 11 months ago
kaistAI / GAP
[ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization
☆29 • Sep 12, 2024 • Updated last year
zankner / CLoud
Critique-out-Loud Reward Models
☆76 • Oct 18, 2024 • Updated last year
NAVER-Cloud-HyperCLOVA-X / hcx-vllm-plugin
vLLM plugin for HyperCLOVAX
☆15 • Jan 27, 2026 • Updated 5 months ago
kaistAI / KtrlF
[NAACL 2024] Official repository for "KTRL+F: Knowledge-Augmented In-Document Search"
☆23 • Oct 11, 2024 • Updated last year
MattYoon / reasoning-models-confidence
[NeurIPS 2025] Reasoning Models Better Express Their Confidence
☆23 • Nov 19, 2025 • Updated 8 months ago
AlignInc / aligner-replication
A reproduction of the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction"
☆21 • May 29, 2024 • Updated 2 years ago
Marker-Inc-Korea / CoT-llama2
Fine-tuning llama2 using the chain-of-thought approach
☆10 • Nov 18, 2023 • Updated 2 years ago
IBM / benchbench
A package dedicated to running benchmark agreement testing
☆19 • Sep 18, 2025 • Updated 10 months ago
DeepBaksuVision / You_Only_Look_Once
☆10 • Dec 14, 2018 • Updated 7 years ago
r-three / realistic_evaluation_of_model_merging_for_compositional_generalization
☆13 • Feb 11, 2026 • Updated 5 months ago
prometheus-eval / scaling-evaluation-compute
Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"
☆12 • Mar 25, 2025 • Updated last year
SeungoneKim / CoTEVer
[EACL 2023] CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification
☆42 • Apr 29, 2023 • Updated 3 years ago
prometheus-eval / cmu-paper-reviewer
Code repository for the "CMU Paper Reviewer System", an agentic system that generates reviews for academic papers.
☆25 • Jun 9, 2026 • Updated last month
kimyuji / EvolvingQA_benchmark
Code and Dataset release of "Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models" (NAACL 2024)
☆10 • Oct 16, 2024 • Updated last year
jiyounglee-0523 / FourierDecoder
Official repository for Fourier model that can generate periodic signals
☆10 • Mar 10, 2022 • Updated 4 years ago
Sunkyoung / Compare-tokenizer
Tokenizer comparison experiments
☆11 • Jan 3, 2022 • Updated 4 years ago
austrian-code-wizard / c3po
☆30 • Apr 6, 2026 • Updated 3 months ago
kaistAI / FLASK
[ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
☆218 • Dec 24, 2023 • Updated 2 years ago
x66ccff / liveideabench
[Nature Communications] 🤖💡 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal C…
☆30 • Apr 21, 2026 • Updated 3 months ago
debjitpaul / Causal_CoT
The corresponding code from our paper "Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning…
☆13 • Jan 14, 2026 • Updated 6 months ago
allenai / reward-bench
RewardBench: the first evaluation tool for reward models.
☆727 • Feb 16, 2026 • Updated 5 months ago
daekeun-ml / evaluate-llm-on-korean-dataset
Performs benchmarking on two Korean datasets with minimal time and effort.
☆45 • Jan 22, 2026 • Updated 6 months ago
corca-ai / evaluating-gpt-4o-on-CLIcK
Evaluate gpt-4o on CLIcK (Korean NLP Dataset)
☆20 • May 18, 2024 • Updated 2 years ago
Taeu / HeLP-Challenge-Goldenpass
☆11 • Mar 12, 2019 • Updated 7 years ago
cimm-kzn / RuDReC
Russian Drug Reaction Corpus (RuDReC)
☆13 • Dec 29, 2020 • Updated 5 years ago
daniel-furman / polyglot-or-not
Are foundation LMs multilingual knowledge bases? (EMNLP 2023)
☆18 • Dec 8, 2023 • Updated 2 years ago
juletx / self-translate
Do Multilingual Language Models Think Better in English?
☆42 • Aug 3, 2023 • Updated 2 years ago
qbxlvnf11 / MultiWOZ2.1-parser
MultiWOZ2.1-Parser for Dialogue State Tracking
☆13 • Aug 3, 2021 • Updated 4 years ago
joeljang / ELM
[ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning
☆99 • Apr 26, 2023 • Updated 3 years ago
eagle705 / awesome-nlp-note
A curated list of resources dedicated to NLP (papers, blogs, notes, etc.)
☆13 • Nov 30, 2019 • Updated 6 years ago
bhpfelix / NDDR-CNN-PyTorch
PyTorch Implementation of "NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction"
☆14 • Jun 29, 2019 • Updated 7 years ago
jo1jun / Vision_Transformer
☆18 • May 16, 2021 • Updated 5 years ago
jonathan-roberts1 / SciFIBench
NeurIPS 2024: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation
☆13 • May 24, 2025 • Updated last year
PEBpung / MLOps-Tutorial
WandB Sweeps with PyTorch 🧹
☆15 • Dec 24, 2022 • Updated 3 years ago
OpenBMB / Eurus
☆322 • Sep 18, 2024 • Updated last year
kaistAI / Knowledge-Entropy
[ICLR 2025 Oral] Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
☆17 • Nov 25, 2024 • Updated last year