CarlanLark / Lp-Reg
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
☆33Updated 2 months ago
Alternatives and similar repositories for Lp-Reg
Users that are interested in Lp-Reg are comparing it to the libraries listed below
- ☆86Updated 10 months ago
- [BIRD-INTERACT] Re-imagines Text-to-SQL evaluation via lens of dynamic interactions.☆454Updated last month
- (EMNLP 2025 Findings) Source Evaluation scripts for Humanity's Last Code Exam☆95Updated 4 months ago
- The code for Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models (Findings of ACL 2025)☆83Updated 5 months ago
- MAX31855 full-featured driver library for general-purpose MCU and Linux.☆70Updated 2 months ago
- A lightweight framework for building LLM agentic systems, with additional support for program synthesis and neural-symbolic research.☆87Updated 3 weeks ago
- The code to implement the distributional reinforcement learning algorithm.☆60Updated 4 months ago
- ☆71Updated 2 months ago
- ☆100Updated 11 months ago
- [ACL 2025 Oral] QAEncoder: Towards Aligned Representation Learning in Question Answering Systems☆176Updated 5 months ago
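- [Latest international stock platform] Codename: Stock-Finex. Multilingual stock trading system. Features: IPO subscription, block trading, margin financing, pledge-based wealth management, online customer service. Multi-language support, latest stock source code, stock platform setup, Java stock system.☆80Updated 5 months ago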
- ☆50Updated 8 months ago
- ☆75Updated 7 months ago
- The code for paper "Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review" accepted by ACL 2025.☆103Updated 7 months ago
- Evolve-AI☆40Updated 10 months ago
- TikTok emojis component library monorepo. Contains React and Vue 3 packages with 46 secret TikTok emojis (smile, happy, angry, etc.) usin…☆203Updated 4 months ago
- A sophisticated LangGraph-based agent that automates financial options analysis with real-time data from Polygon.io, smart caching, persi…☆52Updated last week
- ☆209Updated 2 months ago
- Using multiple regression model for analyzing and predicting the stock price☆41Updated 10 months ago
- ☆76Updated 3 weeks ago
- EmbodyHub☆79Updated 10 months ago
- A fall detection system that fuses video and Wi-Fi.☆67Updated 3 months ago
- A modern web application for the Melbourne University Ultimate Frisbee Club, built with Next.js 15, TypeScript, and Tailwind CSS. This pl…☆101Updated 4 months ago
- ☆51Updated 2 months ago
- Official repository of DARE: dLLM Alignment and Reinforcement Executor☆119Updated this week
- Expanded netty to provide some tool classes, while supporting serial communication☆88Updated last week
- Next.js-based personal blog page.☆82Updated 4 months ago
- ☆204Updated last year
- Go bindings for the CUDA Driver and Runtime APIs, cuBLAS, and cuDNN.☆154Updated 2 weeks ago
- A gRPC framework for Go that provides out-of-the-box gRPC service development experience.☆80Updated 2 months ago