Mryangkaitong / deepseek-r1-gsm8k
☆48Feb 10, 2025Updated last year
Alternatives and similar repositories for deepseek-r1-gsm8k
Users that are interested in deepseek-r1-gsm8k are comparing it to the libraries listed below
- Berkeley Function Calling Leaderboard (BFCL) with Chinese-Language Evaluation☆23Apr 6, 2025Updated 10 months ago
- This is the repository for the paper CSPRD: A Financial Policy Retrieval Dataset for Chinese Stock Market☆19Mar 8, 2024Updated last year
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"☆62Dec 8, 2025Updated 2 months ago
- Train your GRPO with zero dataset and low resources; 8bit/4bit/LoRA/QLoRA supported, multi-GPU supported ...☆79Apr 30, 2025Updated 9 months ago
- PyTorch version of the UniLM model☆27Jun 19, 2021Updated 4 years ago
- This repository aims to reproduce R1-Zero in the medical domain.☆32Jun 11, 2025Updated 8 months ago
- ☆38Feb 16, 2024Updated 2 years ago
- RL algorithm for stock trading with multiple reward functions☆11Apr 21, 2024Updated last year
- Implementation of the model from "Faster sorting algorithms discovered using deep reinforcement learning" that discovered an all-new ult…☆11Aug 29, 2023Updated 2 years ago
- Reproduce R1 Zero on Logic Puzzle☆2,435Mar 20, 2025Updated 10 months ago
- This project focuses on stock prediction; our goal is to implement a trading framework using DRL with LSTM.☆11Jun 1, 2018Updated 7 years ago
- Policy Optimization is awesome, let’s put a tree on it! 🌲🌟☆22Jul 4, 2025Updated 7 months ago
- ☆45Nov 20, 2025Updated 2 months ago
- FinanceGPT-B☆10Mar 26, 2024Updated last year
- ☆10Jul 21, 2019Updated 6 years ago
- Open Source Tsetlin Machine framework☆17Oct 15, 2018Updated 7 years ago
- ☆47Apr 9, 2025Updated 10 months ago
- In the high-frequency era of trading, stock orders can be executed in under a millisecond. The information about the thousands of orders …☆10Mar 30, 2016Updated 9 years ago
- Mac port of Torcs, The Open Racing Car Simulator☆11Jun 16, 2010Updated 15 years ago
- Win + D for One Monitor (Show Desktop only for One Monitor)☆10Dec 15, 2022Updated 3 years ago
- High-quality reference implementations of various algorithms for Inverse Reinforcement Learning☆13Jun 20, 2018Updated 7 years ago
- todo: desc☆11Aug 12, 2021Updated 4 years ago
- Professional Wargaming LLM Toolbox☆20Jul 9, 2025Updated 7 months ago
- Implementation about a recommender System using RQ-VAE Semantic IDs☆16Aug 11, 2025Updated 6 months ago
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight)☆20Aug 1, 2025Updated 6 months ago
- Predicting the Short-term Direction of Futures Contracts through Machine Learning☆14Oct 15, 2024Updated last year
- Gathers machine learning and deep learning models for Reinforcement Learning☆10Sep 8, 2018Updated 7 years ago
- Reinforcement learning crypto trading bot☆10Oct 30, 2020Updated 5 years ago
- ☆13May 25, 2023Updated 2 years ago
- Ground-Aware Point Cloud Semantic Segmentation for Autonomous Driving. ACM Multimedia 2019.☆12Sep 19, 2019Updated 6 years ago
- Persistent dense gemm for Hopper in `CuTeDSL`☆15Aug 9, 2025Updated 6 months ago
- ☆13Jan 5, 2022Updated 4 years ago
- Applying the Trading Deep Q-Network algorithm (TDQN) on shares in the hydrogen sector.☆11Nov 11, 2020Updated 5 years ago
- Keyscan: AI-powered API key scanner for GitHub Gists.☆28Jan 1, 2026Updated last month
- Created a simple neural network using C++17 standard and the Eigen library that supports both forward and backward propagation.☆10Jul 27, 2024Updated last year
- This is the official code repository for the paper "Decoding Global Preferences: Temporal and Cooperative Dependency Modeling in Multi-Ag…☆11Feb 6, 2025Updated last year
- ☆11Oct 6, 2020Updated 5 years ago
- Cheminformatic analysis of small molecule type drugs in DrugBank for their ability to form nanoparticles with indocyanine dyes.☆11Apr 30, 2018Updated 7 years ago
- Code of "Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model"☆14Jul 8, 2025Updated 7 months ago