☆85Feb 3, 2025Updated last year
Alternatives and similar repositories for DeepSeek-RL-Qwen-0.5B-GRPO-gsm8k
Users who are interested in DeepSeek-RL-Qwen-0.5B-GRPO-gsm8k are comparing it to the repositories listed below.
- AlphaGo Zero implemented from scratch☆17Nov 10, 2024Updated last year
- A demonstration of how to train a custom tokenizer similar to TikToken.☆15Jan 6, 2025Updated last year
- Examples for learning assembly language☆10Aug 5, 2021Updated 4 years ago
- ☆12Dec 22, 2025Updated 4 months ago
- [EMNLP 2023] Knowledge Rumination for Pre-trained Language Models☆17Jun 29, 2023Updated 2 years ago
- Use pretrained BERT model to automatically generate grammar multiple choice questions (MCQ) from any news article or story.☆13Oct 2, 2019Updated 6 years ago
- This repo is the artifact of FUEL☆15Apr 24, 2026Updated last week
- (NBCE) Naive Bayes-based Context Extension on ChatGLM-6b☆15Jun 7, 2023Updated 2 years ago
- [ACL'24 Findings] Official code for "TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback"☆12Dec 6, 2024Updated last year
- ☆26Mar 21, 2024Updated 2 years ago
- [AAAI 2026] Multimodal Deepresearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework☆53Jan 25, 2026Updated 3 months ago
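- Adds some files missing from Visualglm so that Visualglm can be trained; the example performs physiognomy (face-reading) recognition on human faces☆13Jun 7, 2023Updated 2 years ago
- Reading, study, and video-sharing notes on the book "Reinforcement Learning"☆79Apr 1, 2025Updated last year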
- Load Tensorflow pb file using Bert/TextCNNs, an ensemble model using Java.☆10Aug 20, 2021Updated 4 years ago
- 2024 CCF International AIOps Challenge, Track 2 (GLM4): solution write-up for the retrieval-augmented operations and maintenance knowledge Q&A challenge.☆14Jul 5, 2024Updated last year
- Pytorch Implementation of DECOLLAGE (ECCV 2024)☆11Oct 21, 2024Updated last year
- Graph QABot Demo | Knowledge-graph question-answering example☆15Apr 11, 2023Updated 3 years ago
- Code for paper 'Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse'☆13Aug 2, 2024Updated last year
- Record of the courseware and code used in bilibili video lectures☆38Mar 24, 2026Updated last month
- [CIKM 2025] LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking☆18Sep 6, 2025Updated 7 months ago
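- Fine-tunes Alibaba's open-source text detection model, using OCR results returned by the 合合 (Hehe) recognition service as initial training data, so that the model better fits a specific scenario of 10,000 images and recognizes text more accurately.☆10Dec 9, 2024Updated last year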
- Courses in UCAS☆14Jun 12, 2023Updated 2 years ago
- Resources that may be needed when integrating with 连连 (Lianlian) services☆16Apr 25, 2022Updated 4 years ago
- Client library for the Fourier GRx series robot☆15May 16, 2025Updated 11 months ago
- Chinese text error correction using a pinyin tree and edit distance☆13Jul 19, 2019Updated 6 years ago
- Using Siamese LSTM to classify repeated quora questions. Attempted pretrained bert embeddings, Word2Vec and training own embeddings toget…☆10Aug 28, 2020Updated 5 years ago
- Official repository for ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use☆30Nov 4, 2025Updated 5 months ago
- Awesome Long-CoT Data☆20Mar 26, 2025Updated last year
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"☆17Feb 26, 2026Updated 2 months ago
- Backdooring Neural Code Search☆14Sep 8, 2023Updated 2 years ago
- Additional material for the malware variants identification paper☆10Sep 6, 2019Updated 6 years ago
- ☆22Jul 15, 2024Updated last year
- ☆33May 9, 2025Updated 11 months ago
- Distributed simulation system based on muduo☆12Feb 17, 2023Updated 3 years ago
- I-SHEEP: Iterative Self-enHancEmEnt Paradigm of LLMs through Self-Instruct and Self-Assessment☆17Jan 16, 2025Updated last year
- Vision Transformer-Inspired Automated Vulnerability Repair☆19May 13, 2025Updated 11 months ago
- Checks the similarity of lab report contents. The reports are Word documents with a doc or docx extension. Detection uses the simhash algorithm.☆13May 24, 2018Updated 7 years ago
- Official implementation of UnifiedReward & UnifiedReward-Think☆18Jun 18, 2025Updated 10 months ago
- ☆21Aug 23, 2024Updated last year