ZNLP/Language-Imbalance-Driven-Rewarding

[ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving

☆25

Alternatives and similar repositories for Language-Imbalance-Driven-Rewarding

Users who are interested in Language-Imbalance-Driven-Rewarding are comparing it to the repositories listed below.

yihedeng9 / DuoGuard
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails
☆34 · Feb 26, 2025 · Updated last year
NJUNLP / x-LLM
☆25 · Oct 6, 2023 · Updated 2 years ago
Trae1ounG / Pretrain_Space_RLVR
[arxiv: 2604.14142] From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space
☆17 · Apr 16, 2026 · Updated 3 months ago
gpengzhi / CrossConST-MT
Code for Findings of ACL 2023 paper "Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency …
☆10 · Jul 18, 2023 · Updated 3 years ago
maitrix-org / dynamic-alignment-optimization
[EMNLP'24 (Main)] DRPO(Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-…
☆24 · Nov 17, 2024 · Updated last year
October2001 / ProLong
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
☆61 · Jul 23, 2024 · Updated 2 years ago
holarissun / RewardModelingBeyondBradleyTerry
official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…
☆73 · Apr 2, 2025 · Updated last year
eternal8080 / MV-MATH
Description for MV-MATH
☆15 · Jul 20, 2025 · Updated last year
circle-hit / Lens
Code for our paper titled "Lens: Rethinking Multilingual Enhancement for Large Language Models"
☆12 · Oct 15, 2024 · Updated last year
MozerWang / AMPO
[ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents
☆51 · Feb 2, 2026 · Updated 5 months ago
stefanhgm / patient_summaries_with_llms
Code for "A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models"
☆17 · Jul 20, 2025 · Updated last year
JinLi-i / MoDiCF
The source code of [WWW 2025] MoDiCF
☆16 · Mar 26, 2026 · Updated 3 months ago
chenlong-clock / RULE-Unlearn
[NeurIPS25] RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality
☆20 · Oct 22, 2025 · Updated 9 months ago
ncsu-dk-lab / Acc-DD
☆14 · Apr 21, 2023 · Updated 3 years ago
CCIIPLab / CET
This is the source code for: Context-aware Entity Typing in Knowledge Graphs.
☆16 · May 10, 2022 · Updated 4 years ago
dahui888 / GoodTrash
A garbage-sorting app that displays garbage-sorting news and lets users enter the item they want classified via text, voice, or image.
☆13 · May 13, 2021 · Updated 5 years ago
TraceElephant / TraceElephant
Repo of "Seeing the Whole Elephant: A Benchmark for Failure Attribution in LLM-based Multi-Agent Systems" (ACL 2026)
☆16 · Apr 27, 2026 · Updated 2 months ago
hemingkx / SWIFT
[ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
☆70 · Feb 21, 2025 · Updated last year
CogComp / faithful_summarization
☆18 · May 5, 2021 · Updated 5 years ago
cwang621 / blsp
BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing
☆59 · Mar 11, 2024 · Updated 2 years ago
CASIA-LM / OpenS2S
OpenS2S: Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model
☆119 · Mar 28, 2026 · Updated 3 months ago
HenryCai11 / LLM-Self-Control
The official repo of paper "Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller"
☆18 · Aug 13, 2024 · Updated last year
RulinShao / RAG-evaluation-harnesses
An evaluation suite for Retrieval-Augmented Generation (RAG).
☆25 · Apr 26, 2025 · Updated last year
nuprl / MultiPL-T
Knowledge transfer from high-resource to low-resource programming languages for Code LLMs
☆17 · Aug 12, 2025 · Updated 11 months ago
Lichang-Chen / AlpaGasus
A better Alpaca Model Trained with Less Data (only 9k instructions of the original set)
☆24 · Jul 26, 2024 · Updated last year
Geaming2002 / Ruler
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models
☆40 · Sep 30, 2024 · Updated last year
Ultramarine-spec / UCAS_Course
First-year graduate courses at the University of Chinese Academy of Sciences (UCAS)
☆19 · May 24, 2023 · Updated 3 years ago
uclaml / PDE
Official repo of Progressive Data Expansion: data, code and evaluation
☆29 · Nov 16, 2023 · Updated 2 years ago
yulu-dada / Learned-conf-NMT
☆16 · Mar 11, 2022 · Updated 4 years ago
rohan598 / ConTextual
☆27 · Jul 20, 2024 · Updated 2 years ago
yeeeqichen / FiTs
[AAAI 2023] Official implementation of FiTs: Fine-grained Two-stage Training for Knowledge Base Question Answering
☆11 · Mar 10, 2023 · Updated 3 years ago
HollyLee2000 / SeBoW-paddle
This is the Paddle code for SeBoW (Self-Born wiring for neural trees), a kind of neural tree born from a large search space
☆11 · Dec 10, 2021 · Updated 4 years ago
mandyyyyii / east
☆19 · Aug 4, 2025 · Updated 11 months ago
tylerdotai / meta-harness-evolver
Meta-Harness: End-to-End Optimization of LLM Harnesses — OpenClaw Agent Evolution System
☆15 · Updated this week
rvenet / RVENet
Source code related to the research paper entitled RVENet: A Large Echocardiographic Dataset for the Deep Learning-Based Assessment of Ri…
☆12 · Mar 10, 2024 · Updated 2 years ago
VITA-Group / SEAL
[COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free
☆60 · Apr 6, 2025 · Updated last year
ljcleo / agent_sense
Benchmarking Social Intelligence of Language Agents through Interactive Scenarios
☆13 · Jan 4, 2025 · Updated last year
1171-jpg / BrainTeaser
☆16 · Feb 1, 2024 · Updated 2 years ago
YuYang0901 / LaViSE
Explaining Deep Convolutional Neural Networks via Unsupervised Visual-Semantic Filter Attention (CVPR 2022)
☆20 · Mar 31, 2022 · Updated 4 years ago