Policy Optimization is awesome, let’s put a tree on it! 🌲🌟
☆22Jul 4, 2025Updated 8 months ago
Alternatives and similar repositories for MCTS-GRPO
Users that are interested in MCTS-GRPO are comparing it to the libraries listed below
- Command helper for Slurm systems. Act as if you are on a compute node.☆15Feb 1, 2025Updated last year
- STREET: a multi-task and multi-step reasoning dataset☆26Feb 28, 2024Updated 2 years ago
- Unsupervised Natural Language Parsing (Tutorial)☆22Apr 19, 2021Updated 4 years ago
- ☆15Feb 10, 2025Updated last year
- This is the repository of the EnviroDetaNet☆13Sep 3, 2024Updated last year
- Materials related to the WeChat Channels recommendation competition☆10Nov 4, 2021Updated 4 years ago
- Few-Shot Relation Extraction with AllenNLP☆13Jan 27, 2019Updated 7 years ago
- [AAAI 2025] Official Implementation of "HDT: Hierarchical Discrete Transformer for Multivariate Time Series Forecasting"☆16Feb 17, 2025Updated last year
- ☆22Feb 3, 2026Updated last month
- Reuse of Su Jianlin's project, applied to financial event relation extraction☆10Mar 26, 2021Updated 4 years ago
- The official repository of MM-R5☆28Jun 22, 2025Updated 8 months ago
- ☆13Updated this week
- Code of "Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model"☆14Jul 8, 2025Updated 7 months ago
- Automatically Update LLM Papers Daily using Github Actions. Ref: https://github.com/Vincentqyw/cv-arxiv-daily☆10Updated this week
- Self-developed cross-platform remote desktop control software☆13Feb 19, 2024Updated 2 years ago
- ShanghaiTech SI140A Probability & Statistics for EECS, Spring 2023, Spring 2024.☆24Feb 15, 2026Updated 2 weeks ago
- Cheminformatic analysis of small molecule type drugs in DrugBank for their ability to form nanoparticles with indocyanine dyes.☆11Apr 30, 2018Updated 7 years ago
- Personal Neovim configuration (based on LazyVim)☆12Updated this week
- ☆12Jul 2, 2025Updated 8 months ago
- CRNN with Self-Attention☆10Apr 8, 2018Updated 7 years ago
- ☆10Jun 16, 2021Updated 4 years ago
- ☆15Jul 25, 2024Updated last year
- DevOps learning☆10Jan 10, 2020Updated 6 years ago
- ☆11Nov 16, 2019Updated 6 years ago
- ☆16Oct 16, 2024Updated last year
- A collection of deep reinforcement learning-based & GFlowNet drug molecule generators focused on generation of molecules using Graphs/SEL…☆10Dec 11, 2022Updated 3 years ago
- Archive of materials from the (former) Tokyo Institute of Technology IGP group☆15Oct 10, 2024Updated last year
- ☆18Apr 20, 2025Updated 10 months ago
- ☆14Feb 20, 2024Updated 2 years ago
- [NeurIPS 24] Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation☆18Jan 2, 2026Updated 2 months ago
- Wide & Deep implemented in PyTorch and validated on the Avazu dataset☆11Feb 4, 2021Updated 5 years ago
- ☆14Nov 27, 2021Updated 4 years ago
- CS194-196 Course Project☆14Feb 20, 2025Updated last year
- Code for the ACL 2021 paper "Structural Guidance for Transformer Language Models"☆13Sep 17, 2025Updated 5 months ago
- BERT-based text sentiment analysis☆12Nov 4, 2022Updated 3 years ago
- DrugGen: Advancing Drug Discovery with Large Language Models and Reinforcement Learning Feedback☆21May 22, 2025Updated 9 months ago
- This is the official code for the paper 'Systematically Exploring Redundancy Reduction in Summarizing Long Documents'.☆16Apr 30, 2021Updated 4 years ago
- Use pytorch the right way http://pytorch.org/docs/☆14Nov 1, 2017Updated 8 years ago
- TimelyRec (Learning Heterogeneous Temporal Patterns of User Preference for Timely Recommendation, WWW'21)☆13Jul 2, 2021Updated 4 years ago