wantbook-book / SeRL
SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data
☆21Jan 24, 2026Updated 3 weeks ago
Alternatives and similar repositories for SeRL
Users who are interested in SeRL are comparing it to the libraries listed below.
- This repository contains the code and pre-trained models for our paper☆21Jun 29, 2025Updated 7 months ago
- Transformer Doctor: Diagnosing and Treating Vision Transformers☆11Jan 15, 2025Updated last year
- ☆32Oct 4, 2025Updated 4 months ago
- [IEEE Transactions on Power Systems] Transmission Interface Power Flow Adjustment: A Deep Reinforcement Learning Approach based on Multi-…☆24Jun 2, 2024Updated last year
- Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks☆17Jan 15, 2025Updated last year
- Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and Treating CNN Classifiers [https://arxiv.org/pdf/2112.04934.pdf]☆15May 13, 2023Updated 2 years ago
- A Survey of Direct Preference Optimization (DPO)☆91Jul 4, 2025Updated 7 months ago
- [TPAMI] Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning☆33May 17, 2024Updated last year
- [SIGKDD' 24] PyTorch implementation of Temporal Prototype-Aware Learning for Active Voltage Control on Power Distribution Networks☆13Jul 28, 2024Updated last year
- ☆10Jul 13, 2024Updated last year
- A benchmark of Python Library Migration☆14Apr 5, 2025Updated 10 months ago
- ☆20Aug 8, 2025Updated 6 months ago
- Our repo contains an efficient RGB-D feature extractor for category-level and instance-level 6D pose estimation.☆14Oct 29, 2025Updated 3 months ago
- vscode-translation: a translation plugin☆10Mar 3, 2022Updated 3 years ago
- ☆14Jun 15, 2023Updated 2 years ago
- Official implementation of SPGrasp: A framework for dynamic grasp synthesis from sparse spatiotemporal prompts.☆19Jan 6, 2026Updated last month
- ☆13Aug 4, 2025Updated 6 months ago
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents☆21Jan 6, 2026Updated last month
- PyTorch Implementation for the paper "Let Me Help You! Neuro-Symbolic Short-Context Action Anticipation" accepted to RA-L'24.☆12Nov 27, 2024Updated last year
- TESGNN: 3D Temporal Equivariant Scene Graph Neural Networks (published at TMLR)☆14Nov 2, 2025Updated 3 months ago
- ☆10Oct 25, 2024Updated last year
- ☆17May 3, 2025Updated 9 months ago
- Reconsidering the Performance of GAE in Link Prediction☆16Jan 12, 2026Updated last month
- The official implementation of the paper "Self-Updatable Large Language Models by Integrating Context into Model Parameters"☆15May 18, 2025Updated 8 months ago
- [Pattern Recognition, 2020] Covariance Descriptors on a Gaussian Manifold and their Application to Image Set Classification☆12May 28, 2022Updated 3 years ago
- [RAL 2025] MTIL: Encoding Full History with Mamba for Temporal Imitation Learning☆27Nov 17, 2025Updated 2 months ago
- [L4DC 2025] Morphological-Symmetry-Equivariant Heterogeneous Graph Neural Network for Robotic Dynamics Learning☆18Dec 6, 2025Updated 2 months ago
- ManifoldNet Paper Implementation for SPD(n)☆11Nov 10, 2021Updated 4 years ago
- Initial commit☆12Aug 14, 2023Updated 2 years ago
- A MacOS OCR Native Node.js Module☆19Oct 11, 2025Updated 4 months ago
- The dataset and codes of the paper UniMod1K: Towards a More Universal Large-Scale Dataset and Benchmark for Multi-Modal Learning.☆16Sep 21, 2025Updated 4 months ago
- The public reproducible analysis code used for the gaze project☆11Dec 26, 2025Updated last month
- ReSemAct: Advancing Fine-Grained Robotic Manipulation via Semantic Structuring and Affordance Refinement☆17Jan 5, 2026Updated last month
- E-commerce shopping-guide bot: natural language understanding (NLU), text error correction, and ambiguous-word disambiguation☆12May 5, 2020Updated 5 years ago
- [2023 CoRL] Leveraging 3D Reconstruction for Mechanical Search on Cluttered Shelves☆11Dec 12, 2024Updated last year
- [NeurIPS 2024 poster] Cross-model Control: Improving Multiple Large Language Models in One-time Training☆14Oct 25, 2024Updated last year
- [ACL 2025 Main] (🏆 Outstanding Paper Award) Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Proba…☆15Aug 15, 2025Updated 6 months ago
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning☆175Sep 18, 2025Updated 4 months ago
- ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping [CVPR 2025]☆68Aug 17, 2025Updated 5 months ago