Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
☆33Oct 5, 2025Updated 7 months ago
Alternatives and similar repositories for Lp-Reg
Users who are interested in Lp-Reg are comparing it to the repositories listed below.
- Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward☆44Nov 18, 2025Updated 5 months ago
- [ICLRW 2026 Best Short Paper Award] Visual Exclusivity Attacks: Automatic Multimodal Red Teaming via Agentic Planning☆86Apr 15, 2026Updated 3 weeks ago
- [EMNLP 2023] Question Answering as Programming for Solving Time-Sensitive Questions☆12Dec 18, 2023Updated 2 years ago
- AI Phone Agent: A starter kit to build AI agents that answer real phone calls and talk to customers in real time (OpenAI Realtime). Node.…☆53Apr 18, 2026Updated 2 weeks ago
- Implementation and analysis of red-black trees (SDU CS Data Structures and Algorithms Course Design)☆12Jan 9, 2025Updated last year
- [EMNLP'22] Title2Event: Benchmarking Open Event Extraction with a Large-scale Chinese Title Dataset☆20Apr 4, 2023Updated 3 years ago
- The code for the paper "Toward Optimal LLM Alignments Using Two-Player Games".☆17Jun 20, 2024Updated last year
- Code for the paper - Controlling Dialogue Generation with Semantic Exemplars (NAACL 2021) A semantic exemplar based retrieve-refine appro…☆18Mar 26, 2021Updated 5 years ago
- INSCIT: Information-Seeking Conversations with Mixed-Initiative Interactions☆16Jan 21, 2025Updated last year
- Adaptive Weight Scheduling for Multi-Objective GRPO in Code Generation. Fixed multi-objective rewards cause reward hacking (short but bro…☆49Apr 14, 2026Updated 3 weeks ago
- ☆22Oct 20, 2022Updated 3 years ago
- Code for EMNLP 2023 long paper: An Iteratively Parallel Generation Method with the Pre-Filling Strategy for Document-level Event Extracti…☆19Feb 2, 2025Updated last year
- The official implementation of "Black Sheep in the Herd: Playing with Spuriously Correlated Attributes for Vision-Language Recognition" …☆73Apr 4, 2026Updated last month
- Source code for "A Two-Stream AMR-enhanced Model for Document-level Event Argument Extraction" @ NAACL 2022☆19May 1, 2022Updated 4 years ago
- The code for the paper "Conditional Temporal Variational AutoEncoder for Action Video Prediction"☆81Mar 27, 2022Updated 4 years ago
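- [java+springboot+vue3] 三勾 food-ordering system: campus ordering system, store ordering system, 三勾 catering system, campus catering system, store catering system☆113May 21, 2025Updated 11 months ago
- Course-selection script for USTC graduate academic seminars☆18Dec 6, 2022Updated 3 years ago
- A high-precision AI image matting tool built on the SOTA model BiRefNet☆61Jan 22, 2026Updated 3 months ago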
- HY-SOAR: Self-Correction for Optimal Alignment and Refinement in Diffusion Models☆342Apr 21, 2026Updated 2 weeks ago
- Official code of the paper "Rethinking Infrared Small Target Detection: A Foundation-Driven Efficient Paradigm"☆42Dec 8, 2025Updated 4 months ago
- A Linux mini container runtime written in Go☆161Dec 28, 2025Updated 4 months ago
- Code for ACL 2024 long paper: Are AI-Generated Text Detectors Robust to Adversarial Perturbations?☆33Jul 12, 2024Updated last year
- ☆48Nov 11, 2025Updated 5 months ago
- Source code for SIGIR 2023 paper☆24Jul 31, 2023Updated 2 years ago
- EPUB parsing and packaging for 轻小说文库 (a light novel library)☆21May 3, 2020Updated 6 years ago
- ☆27Mar 13, 2024Updated 2 years ago
- NuGet Go SDK☆31Apr 16, 2026Updated 2 weeks ago
- Answers to the algorithms and programming workbook; personal answers provided for classmates' reference. | Help classmates learn algorithms - design patterns.☆77Jan 22, 2026Updated 3 months ago
- ☆25Dec 6, 2022Updated 3 years ago
- Yichi Zhang et al. A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning. EMNL…☆20Nov 5, 2020Updated 5 years ago
- AI-powered tool for analyzing GitHub trending repositories and URL metadata☆25Updated this week
- Failure-first AI regression testing CLI for turning AI failures into local regression assets and PR gates. Quickly turns real AI failures into executable regression assets and a checklist for preventing repeat mistakes.☆81Apr 10, 2026Updated 3 weeks ago
- ☆24Aug 16, 2024Updated last year
- The first Object-Oriented Programming (OOP) Evaluation Benchmark for LLMs☆27Jan 15, 2025Updated last year
- Intelligent job recommendation platform using Java + MySQL + Redis. Supports location-based search, AI keyword extraction, and personaliz…☆225Aug 31, 2025Updated 8 months ago
- Source code for "A Two-Stream AMR-enhanced Model for Document-level Event Argument Extraction" @ NAACL 2022☆37May 7, 2022Updated 4 years ago
- Official implementation for "Law of the Weakest Link: Cross capabilities of Large Language Models"☆43Oct 1, 2024Updated last year
- Telegram AI assistant based on LangGraph, supporting long-term memory, web search, in-depth research, and multi-user permission managemen…☆68Dec 27, 2025Updated 4 months ago
- DocEE: A Large-Scale and Fine-grained Benchmark for Document-level Event Extraction☆40Apr 19, 2023Updated 3 years ago