Simple, easy-to-understand code for using GRPO to strengthen math ability on Qwen
☆50May 14, 2025Updated 9 months ago
Alternatives and similar repositories for qwen_grpo_gsm8k
Users who are interested in qwen_grpo_gsm8k are comparing it to the libraries listed below
- Agent that converts natural language queries into SQL and provides response and query created☆55May 28, 2025Updated 9 months ago
- [ICLR 2025] Official PyTorch Implementation for CPE: Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Ga…☆12Apr 7, 2025Updated 11 months ago
- [ICRA 2024] WLST: Weak Labels Guided Self-training for Weakly-supervised Domain Adaptation on 3D Object Detection☆12Feb 6, 2024Updated 2 years ago
- OpenVLA Lightweight Version (0.5B). It uses qwen2-0.5B and fine-tunes using mllm format, without occupying LLM's inherent tokens. It repre…☆16Jan 7, 2026Updated 2 months ago
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆29Jan 13, 2026Updated last month
- The AI Algorithm Proposed by The Master's Degree Thesis of Shengyuan Yan of Wuhan University School of Computer Science☆14May 11, 2022Updated 3 years ago
- Implementation of the CVPR2025 paper LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty.☆17Sep 10, 2025Updated 6 months ago
- RLCar Gazebo v2☆12Jun 28, 2024Updated last year
- ☆13May 11, 2022Updated 3 years ago
- Chinese corpus: a large number of manually annotated samples, very valuable!☆11Aug 15, 2019Updated 6 years ago
- GUIEvalKit: Open-source Evaluation Toolkit for GUI Agents☆19Feb 26, 2026Updated last week
- Final Project of ME5413 Autonomous Mobile Robotics @ NUS☆10Oct 13, 2023Updated 2 years ago
- Documentation at☆14Mar 27, 2025Updated 11 months ago
- ☆11Aug 9, 2018Updated 7 years ago
- SGBM stereo matching algorithm and point cloud generation☆12Jan 29, 2021Updated 5 years ago
- ☆11Dec 24, 2024Updated last year
- Autonomous navigation simulation of an agricultural robot during soil fertilization in open fields using ROS and Gazebo.☆10Apr 8, 2025Updated 11 months ago
- 🔥(ECCV 2024 Oral) RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation☆48Sep 2, 2025Updated 6 months ago
- Repository with all source files relating to the 6CCE3EEP Final Year Project titled "Self Parking with Reinforcement Learning." The proje…☆10Jul 20, 2023Updated 2 years ago
- Deep Introspective SLAM: Deep Reinforcement Learning based Approach to Avoid Tracking Failure in Visual SLAM☆11Jul 31, 2021Updated 4 years ago
- [NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference☆36Oct 29, 2025Updated 4 months ago
- ☆12Jun 27, 2022Updated 3 years ago
- Efficient Decoupled Feature 3D Gaussian Splatting via Hierarchical Compression☆13Mar 17, 2025Updated 11 months ago
- Official PyTorch code for "Vector Quantization Prompting for Continual Learning (NeurIPS2024)".☆10Oct 16, 2024Updated last year
- Extended Implementation of FastLGS☆16Dec 17, 2024Updated last year
- Line-following car using ROS2 + RL☆13Aug 30, 2024Updated last year
- mcp server for robot and automations☆12Feb 27, 2025Updated last year
- Public Codebase supporting the paper "Modeling Cellular Perturbations with The Sparse Additive Mechanism Shift Variational Autoencoder" b…☆14Oct 20, 2023Updated 2 years ago
- Towards Target-Driven Visual Navigation in Indoor Scenes via Generative Imitation Learning☆12Dec 20, 2020Updated 5 years ago
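- Linear regression; ill-conditioned linear regression; cluster analysis; principal component analysis; multi-objective decision making☆11Nov 24, 2021Updated 4 years ago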
- FunASR on-device offline Android version with full 2pass mode☆14Sep 4, 2023Updated 2 years ago
- Generalizable Stable Points Segmentation for 3D LiDAR Scan-to-Map Long-Term Localization☆17Jun 3, 2024Updated last year
- Official Pytorch implementation of the AAAI 2025 "Spiking Point Transformer for Point Cloud Classification"☆15Apr 12, 2025Updated 10 months ago
- Text error-correction model with pinyin and glyph features☆11Jan 1, 2023Updated 3 years ago
- [ECCV 2024] Online Continuous Generalized Category Discovery☆14Oct 6, 2024Updated last year
- Short-duration online speech recognition service based on WeNet☆11Feb 25, 2023Updated 3 years ago
- [NeurIPS 23] Characterizing OOD Error via Optimal Transport☆13Nov 19, 2023Updated 2 years ago
- The official implementation of Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion [AAAI'2…☆15Feb 2, 2026Updated last month
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents☆23Feb 21, 2026Updated 2 weeks ago