A comparison of DeepSeek GRPO and Qwen GSPO for fine-tuning Qwen2.5-1.5B-Instruct.
☆170Mar 28, 2026Updated last month
Alternatives and similar repositories for grpo_reproduce
Users that are interested in grpo_reproduce are comparing it to the libraries listed below.
- ☆105Jul 24, 2025Updated 9 months ago
- ☆14Oct 19, 2025Updated 7 months ago
- ⚡️A collection of awesome things regarding Vuforia Augmented Reality SDK. Feel free to contribute!☆10Mar 12, 2018Updated 8 years ago
- A Foundation Language Model For Multilayer Regulation of RNA☆22Nov 30, 2025Updated 5 months ago
- Attention mechanism incorporated into Asynchronous Advantage Actor-Critic (A3C/A2C, DeepMind)☆10Jan 9, 2018Updated 8 years ago
- Distributed DRL by Ray and TensorFlow Tutorial.☆10Dec 26, 2019Updated 6 years ago
- This is the official repository for the ICLR 2025 Conference Paper - Fast and Slow Streams for Online Time Series Forecasting without Inf…☆18Apr 30, 2025Updated last year
- daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently☆39Feb 4, 2026Updated 3 months ago
- ☆10Sep 20, 2018Updated 7 years ago
- Towards Training-free Open-world Segmentation via Image Prompt Foundation Models☆18Nov 22, 2024Updated last year
- ☆159Mar 18, 2026Updated 2 months ago
- A PyTorch training and inference framework for Baidu's UIE extraction model☆12Nov 20, 2024Updated last year
- AIS 2024 Challenge, Real-Time 4K Super-Resolution of Compressed AVIF Images (Runner-Up Award in Track: Fidelity PSNR), Team XJTU-AIR☆22Jul 23, 2024Updated last year
- ☆30Jan 22, 2025Updated last year
- ☆35Oct 23, 2025Updated 6 months ago
- Accelerating GOT-OCRv2 with vLLM☆10Nov 15, 2024Updated last year
- Demo app with Loguru logging, async middleware to generate X-request-Id. Works with Gunicorn or Uvicorn, and is safe to use with async/th…☆10Feb 2, 2022Updated 4 years ago
- Chinese Pharmacopoeia RAG project☆10Oct 26, 2024Updated last year
- Official Code for EMNLP 2023 paper: "Unveiling the Implicit Toxicity in Large Language Models"☆15Nov 30, 2023Updated 2 years ago
- An unofficial implementation using Pytorch for "Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types". Improve the…☆18Nov 17, 2023Updated 2 years ago
- Cochlear.ai submission for dcase2018 task2☆15Sep 14, 2018Updated 7 years ago
- The official GitHub repo for the open online courses: "Dive into LLMs".☆11Mar 15, 2024Updated 2 years ago
- A PyTorch implementation of conditional random fields (CRF)☆10Mar 7, 2021Updated 5 years ago
- Reproductions of LLM-related algorithms, plus some study notes☆3,364Mar 21, 2026Updated 2 months ago
- Code release of paper "FAITH: Frequency-domain Attention In Two Horizons for Long-term time series forecasting"☆19Jun 20, 2025Updated 11 months ago
- Surrey CVSSP DCASE 2018 Task 2 system☆20Dec 26, 2022Updated 3 years ago
- ☆13May 12, 2025Updated last year
- Chinese fuzzy entity recognition in 100 lines, using a prefix tree (trie) and edit distance☆11Sep 25, 2023Updated 2 years ago
- ☆10Apr 30, 2025Updated last year
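- Baseline for the 2nd Shandong Province Data Application Innovation and Entrepreneurship Competition, main track: lab test report recognition☆13Jan 15, 2021Updated 5 years ago
- My AI learning roadmap: math foundations, machine learning, deep learning, Python, image processing, computer vision☆20Jan 9, 2019Updated 7 years ago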
- A curated collection of projects, benchmarks, and research papers focused on reproducing and advancing the DeepSeek R1 framework.☆15Mar 19, 2025Updated last year
- A code reimplementation of DeepMind's "Multiagent Cooperation and Competition with Deep Reinforcement Learning" in TensorFlow☆15Apr 27, 2018Updated 8 years ago
- Convert your own dataset to COCO format☆22Aug 8, 2019Updated 6 years ago
- A very simple GRPO implementation for reproducing R1-like LLM thinking.☆1,672Nov 21, 2025Updated 6 months ago
- Local DeepSearch (Advantage: Low Threshold): an implementation of Agentic RAG based on DeepSeek-R1 API and Tavily API☆17Jun 21, 2025Updated 11 months ago
- Official code for the CVPR 2024 Paper "Can Biases in ImageNet Models Explain Generalization?".☆13Jun 24, 2024Updated last year
- Using self-play, MCTS, and a deep neural network to create a Hearthstone AI player☆30Nov 1, 2018Updated 7 years ago
- ☆16Aug 31, 2025Updated 8 months ago