1ring2rta / MCTS-GRPO
Policy Optimization is awesome, let’s put a tree on it! 🌲🌟
☆22 · Jul 4, 2025 · Updated 7 months ago
Alternatives and similar repositories for MCTS-GRPO
Users who are interested in MCTS-GRPO are comparing it to the libraries listed below.
- Command helper for Slurm systems. Act as if you are on a compute node. ☆15 · Feb 1, 2025 · Updated last year
- STREET: a multi-task and multi-step reasoning dataset ☆25 · Feb 28, 2024 · Updated last year
- Unsupervised Natural Language Parsing (Tutorial) ☆22 · Apr 19, 2021 · Updated 4 years ago
- ☆15 · Feb 10, 2025 · Updated last year
- [AAAI 2025] Official Implementation of "HDT: Hierarchical Discrete Transformer for Multivariate Time Series Forecasting" ☆16 · Feb 17, 2025 · Updated 11 months ago
- ShanghaiTech SI140A Probability & Statistics for EECS, Spring 2023, Spring 2024. ☆24 · Apr 16, 2025 · Updated 9 months ago
- A reuse of Su Jianlin's project, applied to financial event relation extraction ☆10 · Mar 26, 2021 · Updated 4 years ago
- Personal Neovim configuration (based on LazyVim) ☆12 · Updated this week
- The official repository of MM-R5 ☆28 · Jun 22, 2025 · Updated 7 months ago
- Cheminformatic analysis of small molecule type drugs in DrugBank for their ability to form nanoparticles with indocyanine dyes. ☆11 · Apr 30, 2018 · Updated 7 years ago
- Code of "Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model" ☆14 · Jul 8, 2025 · Updated 7 months ago
- ☆13 · Jun 29, 2025 · Updated 7 months ago
- Automatically Update LLM Papers Daily using Github Actions. Ref: https://github.com/Vincentqyw/cv-arxiv-daily ☆10 · Updated this week
- Self-developed cross-platform remote desktop control software ☆13 · Feb 19, 2024 · Updated last year
- ☆23 · Feb 3, 2026 · Updated last week
- Few-Shot Relation Extraction with AllenNLP ☆13 · Jan 27, 2019 · Updated 7 years ago
- CRNN with Self-Attention ☆10 · Apr 8, 2018 · Updated 7 years ago
- ☆14 · Jul 25, 2024 · Updated last year
- DevOps learning ☆10 · Jan 10, 2020 · Updated 6 years ago
- ☆10 · Jun 16, 2021 · Updated 4 years ago
- ☆11 · Nov 16, 2019 · Updated 6 years ago
- ☆12 · Jul 2, 2025 · Updated 7 months ago
- Archive of materials from the (former) Tokyo Institute of Technology IGP group ☆15 · Oct 10, 2024 · Updated last year
- ☆15 · Oct 16, 2024 · Updated last year
- A collection of deep reinforcement learning-based & GFlowNet drug molecule generators focused on generation of molecules using Graphs/SEL… ☆10 · Dec 11, 2022 · Updated 3 years ago
- [NeurIPS 24] Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation ☆17 · Jan 2, 2026 · Updated last month
- Code for the ACL 2021 paper "Structural Guidance for Transformer Language Models" ☆13 · Sep 17, 2025 · Updated 4 months ago
- ☆14 · Feb 20, 2024 · Updated last year
- ☆14 · Nov 27, 2021 · Updated 4 years ago
- CS194-196 Course Project ☆14 · Feb 20, 2025 · Updated 11 months ago
- Wide & Deep implemented in PyTorch and validated on the Avazu dataset ☆11 · Feb 4, 2021 · Updated 5 years ago
- ☆18 · Apr 20, 2025 · Updated 9 months ago
- BERT-based text sentiment analysis ☆12 · Nov 4, 2022 · Updated 3 years ago
- 2021 Tencent Advertising Algorithm Competition, Track 2, team 神奈川冲浪里 (ranked 8th among the prize winners) ☆16 · May 3, 2022 · Updated 3 years ago
- Diffusion-based generative drug-like molecular editing with chemical natural language ☆18 · Dec 22, 2024 · Updated last year
- Serializing molecule 3D structures ☆14 · Nov 27, 2024 · Updated last year
- Code for the SofT-GRPO algorithm on the LLM soft-thinking reasoning pattern. ☆38 · Jan 2, 2026 · Updated last month
- Course materials for introduction to web-based application development, fall 2017. ☆14 · Dec 14, 2017 · Updated 8 years ago
- TimelyRec (Learning Heterogeneous Temporal Patterns of User Preference for Timely Recommendation, WWW'21) ☆13 · Jul 2, 2021 · Updated 4 years ago