Simple, easy-to-understand code for using GRPO to strengthen math ability on Qwen
☆57May 14, 2025Updated last year
Alternatives and similar repositories for qwen_grpo_gsm8k
Users that are interested in qwen_grpo_gsm8k are comparing it to the libraries listed below.
- This repository contains the reference implementation for the multi-LLM ToM paper (accepted to EMNLP 2023), Theory of Mind for Multi-Agent Collab…☆19Jun 11, 2024Updated last year
- Code release for "Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search" published at NeurIPS '24.☆18Feb 21, 2025Updated last year
- Generating global explanations from local ones☆11Nov 11, 2022Updated 3 years ago
- Multi-Scale Semantic Fusion-Guided Fractal Convolutional Object Detection Network for Optical Remote Sensing Imagery☆12Jul 17, 2022Updated 3 years ago
- LLM-MapBook: AI-Powered Maps for Storytelling. Extracts geo-coordinates from books, visualizes on interactive maps, offering immersive st…☆10Aug 27, 2024Updated last year
- Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)☆12Jul 29, 2023Updated 2 years ago
- This repository contains the ToolSelect dataset which was used to fine-tune Llama-2 70B for tool selection.☆22Mar 11, 2024Updated 2 years ago
- Semantic segmentation of large satellite images into 6 classes using TensorFlow 2.0 and the ISPRS benchmark dataset.☆18Mar 18, 2021Updated 5 years ago
- FreeSWITCH ASR module forked from mod_audio_stream, using the FunASR online CPU version☆18Jun 27, 2025Updated 11 months ago
- [AAAI-25] Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning.☆34May 29, 2025Updated last year
- A fast, lightweight Go-based CLI tool to detect and manage processes using network ports—featuring project awareness, Docker support, and…☆36Jun 5, 2025Updated last year
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆17May 16, 2025Updated last year
- Official TensorFlow implementation of Federated Learning of Generative Image Priors for MRI Reconstruction (FedGIMP)☆15Apr 3, 2022Updated 4 years ago
- Agent that converts natural language queries into SQL and returns both the response and the generated query☆69May 28, 2025Updated last year
- A reinforcement learning agent that learns to solve mazes using Group Relative Policy Optimization (GRPO).☆12Feb 9, 2025Updated last year
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.☆18Aug 1, 2025Updated 10 months ago
- FunASR on-device offline version for Android, 2pass full mode☆15Sep 4, 2023Updated 2 years ago
- The official code for the paper "Deep hash learning for remote sensing image retrieval"☆21Nov 16, 2020Updated 5 years ago
- ☆15Oct 19, 2024Updated last year
- [EMNLP 2024 Main] Official implementation of the paper "Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mech…☆16Oct 8, 2024Updated last year
- Collection of latest papers and materials in the area of RLVR!☆117Jun 1, 2026Updated last week
- MMTL-UniAD: A Unified Framework for Multimodal and Multi-Task Learning in Assistive Driving Perception☆32Sep 15, 2025Updated 8 months ago
- Urban Cup 2023☆16Aug 2, 2023Updated 2 years ago
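- Public JD.com/Taobao customer-service dialogue data; a dialogue system built on a seq2seq generative model that won second place☆44Dec 8, 2022Updated 3 years ago
- A manually annotated Chinese disaster dataset; will be continuously updated.☆16May 12, 2019Updated 7 years ago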
- TensorRT☆11Sep 22, 2020Updated 5 years ago
- PyTorch Bayesian UNet model for segmentation and uncertainty prediction☆30Aug 18, 2022Updated 3 years ago
- ☆26Dec 13, 2021Updated 4 years ago
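- Based on an e-commerce shopping-guide bot: natural language understanding (NLU), text error correction, and ambiguous-word disambiguation☆12May 5, 2020Updated 6 years ago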
- CTC decoder with hotwords for ASR.☆36Apr 13, 2025Updated last year
- ☆27Feb 26, 2023Updated 3 years ago
- Semantic Lidar Odometry☆12May 1, 2020Updated 6 years ago
- Table2answer: Read the database and answer without SQL https://arxiv.org/abs/1902.04260☆14May 11, 2021Updated 5 years ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"☆18Mar 15, 2024Updated 2 years ago
- ☆32Feb 4, 2025Updated last year
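- An IM social system built by integrating three microservice frameworks (Spring Cloud, Dubbo, and Thrift), using Netty for instant messaging, the TensorFlow deep learning framework, and Haar+AdaBoost face recognition. Each module can be taken and used directly as a complete unit; suitable for those interested in microservices and instant messaging…☆11Nov 16, 2022Updated 3 years ago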
- mcp server for robot and automations☆12Mar 20, 2026Updated 2 months ago
- Official code of "Discover and Mitigate Unknown Biases with Debiasing Alternate Networks" (ECCV 2022)☆24Feb 15, 2023Updated 3 years ago
- The implementation of our ECCV 2020 work: Targeted Attack for Deep Hashing based Retrieval.☆28Jun 7, 2021Updated 5 years ago