Cohere-Labs-Community / m-rewardbench

Official Code for M-RᴇᴡᴀʀᴅBᴇɴᴄʜ: Evaluating Reward Models in Multilingual Settings (ACL 2025 Main)

☆35

Alternatives and similar repositories for m-rewardbench

Users who are interested in m-rewardbench are comparing it to the repositories listed below.

guijinSON / MM-Eval
Official implementation for "MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models"
☆16 Updated 11 months ago
bminixhofer / zett
Code for Zero-Shot Tokenizer Transfer
☆138 Updated 9 months ago
SimengSun / alpaca_farm_lora
☆22 Updated 2 years ago
PrasannS / rlhf-length-biases
☆27 Updated last year
roeehendel / icl_task_vectors
☆98 Updated last year
aviclu / ffn-values
☆64 Updated 2 years ago
hamishivi / EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆75 Updated last year
lukemelas / mtob
☆40 Updated last year
yuzhaouoe / pretraining-data-packing
[ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training
☆22 Updated last year
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆100 Updated last year
evandez / REMEDI
Inspecting and Editing Knowledge Representations in Language Models
☆117 Updated 2 years ago
hadasah / btm
☆76 Updated last year
tau-nlp / scrolls
The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences".
☆69 Updated last year
epfl-dlab / llm-latent-language
Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".
☆80 Updated last year
kernelmachine / demix-data
Benchmark API for Multidomain Language Modeling
☆25 Updated 3 years ago
protagolabs / odyssey-math
☆83 Updated 8 months ago
kaistAI / LangBridge
[ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision
☆93 Updated 11 months ago
lmarena / PPE
☆53 Updated 5 months ago
shayne-longpre / a-pretrainers-guide
☆72 Updated 2 years ago
TristanThrush / perplexity-correlations
Simple and scalable tools for data-driven pretraining data selection.
☆28 Updated 4 months ago
tianjunz / HIR
☆159 Updated 2 years ago
mega002 / ff-layers
The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le…
☆96 Updated 4 years ago
liujch1998 / memo-trap
☆22 Updated 2 years ago
joeljang / RLPHF
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
☆110 Updated last year
ndaheim / faithful-dialogue
☆24 Updated 2 years ago
mt-upc / transformer-contributions
Measuring the Mixing of Contextual Information in the Transformer
☆31 Updated 2 years ago
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆77 Updated 2 months ago
anadim / the-little-retrieval-test
☆34 Updated 2 years ago
meg-tong / sycophancy-eval
Datasets from the paper "Towards Understanding Sycophancy in Language Models"
☆94 Updated last year
Nanami18 / Snowballed_Hallucination
☆44 Updated last year