OpenMOSS/BandPO

Official implementation of BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning. BandPO replaces canonical clipping (PPO/GRPO) with dynamic bounds to resolve exploration bottlenecks and prevent entropy collapse.

☆49

Alternatives and similar repositories for BandPO

Users who are interested in BandPO are comparing it to the repositories listed below.

Linxi000 / MEDS
☆142 · Jun 24, 2026 · Updated 3 weeks ago
JT-Ushio / AI-Infra-Seminar
☆24 · Jul 20, 2025 · Updated last year
haowei-freesky / HERMES
Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" [ACL 2026]
☆92 · May 8, 2026 · Updated 2 months ago
RayYuki / CodecBench
☆24 · Nov 16, 2025 · Updated 8 months ago
OpenMOSS / MOSS-Audio-Tokenizer
MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, i…
☆245 · Jun 16, 2026 · Updated last month
xinghaow99 / prism
[ICML 2026] Prism: Spectral-Aware Block-Sparse Attention
☆27 · May 22, 2026 · Updated last month
GUI-Libra / GUI-Libra
Official code for paper "GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable R…
☆67 · Mar 29, 2026 · Updated 3 months ago
alibaba / vstyle
☆34 · Sep 15, 2025 · Updated 10 months ago
Hesse73 / RLVR-Directions
Source Code for our ICLR'26 paper
☆17 · Feb 22, 2026 · Updated 4 months ago
pedr0sorio / lefusion-slicer
3DSlicer plugin for inpainting lung nodules in 3D chest CT data.
☆11 · Dec 2, 2024 · Updated last year
VITA-Group / TAPE
[ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi…
☆15 · Jun 6, 2025 · Updated last year
yangdongchao / ALMTokenizer
The demo page for ALMTokenizer
☆59 · Apr 14, 2025 · Updated last year
euReKa025 / AgentLongBench
☆21 · Jan 29, 2026 · Updated 5 months ago
OpenMOSS / MOSS-VL
MOSS-VL is the core multimodal model series within the OpenMOSS ecosystem, dedicated to visual understanding.
☆375 · Updated this week
EmbodiedForge / Inspire-cli
A tool for better use of Inspire platform (Beta: Codeberg version is more up-to-date)
☆27 · Apr 2, 2026 · Updated 3 months ago
OpenMOSS / UnifiedToolHub
UnifiedToolHub is a comprehensive project supporting LLM-based tool use, designed to unify various tool-use dataset formats and provide t…
☆22 · Jul 23, 2025 · Updated 11 months ago
thu-spmi / CTC-TTS
Code for CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment, Interspeech 2026.
☆20 · Jun 9, 2026 · Updated last month
tongjingqi / Thinking-with-Video
We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that S…
☆314 · Jun 21, 2026 · Updated 3 weeks ago
MasterVito / DAC-RL
Official Repo for DAC-RL: Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability
☆16 · Feb 26, 2026 · Updated 4 months ago
Trae1ounG / Pretrain_Space_RLVR
[arxiv: 2604.14142] From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space
☆17 · Apr 16, 2026 · Updated 3 months ago
OpenMOSS / MOSS-Music
MOSS-Music is an open-source music understanding model for targeting musical captioning, lyrics ASR, structural analysis, chord / key / t…
☆117 · May 9, 2026 · Updated 2 months ago
usail-hkust / Agent-Omit
Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Reinforcement Learning
☆31 · May 11, 2026 · Updated 2 months ago
mlsys-io / Halo_demo
A novel system that unifies LLM serving with query optimization to efficiently process batch agentic workflows.
☆15 · Jun 14, 2026 · Updated last month
OpenMOSS / MOSS-Audio
MOSS-Audio is an open-source foundation model for unified audio understanding, enabling speech, sound, music, captioning, QA, and reasoni…
☆609 · Jun 2, 2026 · Updated last month
nonverbalspeech38k / nonverspeech38k
The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…
☆68 · Dec 26, 2025 · Updated 6 months ago
KexinHUANG19 / InstructTTSEval
☆51 · Jun 25, 2025 · Updated last year
OpenMOSS / MOSS-Transcribe-Diarize
MOSS-Transcribe-Diarize 0.9B is an open-source SOTA end-to-end audio understanding model for long-form multi-speaker transcription, diari…
☆811 · Updated this week
tongjingqi / AI-Can-Learn-Scientific-Taste
We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervis…
☆424 · Jul 13, 2026 · Updated last week
alialsartawi7-sketch / ghosttrace
Modular OSINT and attack surface analysis platform for authorized security research, with risk scoring, attack paths, and report generati…
☆24 · Jun 4, 2026 · Updated last month
OpenMOSS / FutureOmni
☆26 · Jan 22, 2026 · Updated 5 months ago
tianyilt / qzcli_tool
Task management CLI for the Qizhi (启智) platform: resource queries, task submission, log viewing, and MCP/agent workflows
☆107 · Updated this week
open-compass / Creation-MMBench
Assessing Context-Aware Creative Intelligence in MLLMs
☆23 · Jul 22, 2025 · Updated 11 months ago
Aurora-slz / Synth-Empathy
Synth-Empathy: Towards High-Quality Synthetic Empathy Data
☆18 · Feb 28, 2025 · Updated last year
OpenMOSS / claude-codex-handoff
Drop-in async file-based handoff protocol for two AI coding agents (Claude Code + Codex), installed as one shared .handoff/ in your proje…
☆29 · Jul 4, 2026 · Updated 2 weeks ago
cg1177 / Recursive-Multimodal-Agent
☆19 · Jul 1, 2026 · Updated 2 weeks ago
YuLiu-LY / SlotLifter
Code for "SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields" (ECCV 2024)
☆12 · Oct 30, 2024 · Updated last year
Aurora-slz / MM-Verify
☆19 · Oct 28, 2025 · Updated 8 months ago
Miaow-Lab / RLVR-Linearity
[arXiv] "Linear Dynamics in the RLVR Training of Large Language Models"
☆17 · May 25, 2026 · Updated last month
mlsys-io / helium_demo
☆24 · May 2, 2026 · Updated 2 months ago