CarlanLark / Lp-Reg
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
☆33Updated 4 months ago
Alternatives and similar repositories for Lp-Reg
Users that are interested in Lp-Reg are comparing it to the libraries listed below
- ☆84Updated 11 months ago
- (EMNLP 2025 Findings) Source Evaluation scripts for Humanity's Last Code Exam☆95Updated 5 months ago
- Official Implementation of FastMCTS: A Simple Sampling Strategy for Data Synthesis☆112Updated 7 months ago
- ☆209Updated 3 months ago
- Using multiple regression model for analyzing and predicting the stock price☆41Updated 11 months ago
- ☆50Updated 9 months ago
- Borrows ideas from experts' work, with a small portion of original content☆71Updated 9 months ago
- ☆99Updated last year
- Inspired by Recognition and Estimation of Human Finger Pointing (Authors: Eran Bamani, Eden Nissinman, Lisa Koenigsberg, Inbar Meir, Yoa…☆82Updated 10 months ago
- An Interactive Fiction Demo Powered by AI Dungeon☆84Updated 4 months ago
- EmbodyHub☆79Updated last year
- The code for paper "Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review" accepted by ACL 2025.☆103Updated last week
- Next.js-based personal blog page.☆82Updated 5 months ago
- ☆75Updated 8 months ago
- ☆82Updated 5 months ago
- A comprehensive, production-ready framework for building intelligent AI agents with advanced capabilities including tool calling, persist…☆163Updated 5 months ago
- Pop up dialog boxes on your friend's screen, like malware!☆79Updated 6 months ago
- The project for General Adversarial Defense Against Black-box Attacks via Pixel Level and Feature Level Distribution Alignments.☆81Updated 3 years ago
- The code to implement the distributional reinforcement learning algorithm.☆60Updated 5 months ago
- Polyomino: Mapping cell locations via multi-layer regionalization constraints☆36Updated 2 months ago
- The 1st dynamic phishing kit dataset☆202Updated last year
- ☆121Updated 7 months ago
- A lightweight framework for building LLM agentic systems with additional support for program synthesis and neural-symbolic research.☆89Updated 2 months ago
- MAX31855 full-featured driver library for general-purpose MCU and Linux.☆70Updated 3 months ago
- ☆121Updated last month
- A modern web application for the Melbourne University Ultimate Frisbee Club, built with Next.js 15, TypeScript, and Tailwind CSS. This pl…☆101Updated 6 months ago
- [COLM 2025] Assessing Judging Bias in Large Reasoning Models: An Empirical Study https://openreview.net/pdf?id=SlRtFwBdzP☆163Updated 4 months ago
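- Spring project: a personalized navigation service that supports setting weights for time, price, and distance, and updates road conditions and estimated arrival times based on the driving status of a large number of users☆22Updated 9 months ago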
- Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)☆41Updated last year
- TikTok emojis component library monorepo. Contains React and Vue 3 packages with 46 secret TikTok emojis (smile, happy, angry, etc.) usin…☆202Updated 6 months ago