dhcode-cpp/grpo-loss

☆44

Alternatives and similar repositories for grpo-loss

Users that are interested in grpo-loss are comparing it to the libraries listed below.

linjh1118 / Chinese_Awesome_CV
Chinese version of Awesome_CV; clone this project to Overleaf to write your own CV with ease
☆18 · May 24, 2024 · Updated 2 years ago
linjh1118 / Awesome-MLLM-For-Games
MLLM @ Game
☆17 · May 12, 2025 · Updated last year
dhcode-cpp / online-softmax
Simplest online-softmax notebook for explaining Flash Attention
☆18 · Jan 27, 2026 · Updated 5 months ago
linjh1118 / survey_agent
☆17 · Jan 14, 2026 · Updated 6 months ago
chenzen94 / debug-deepspeed-chat
Debug DeepSpeed-Chat step by step in an IDE
☆10 · Apr 17, 2023 · Updated 3 years ago
bojone / softtopk
differentiable top-k operator
☆23 · Dec 30, 2024 · Updated last year
linjh1118 / WisdoMentor
WisdoMentor - Series: An LLM for undergraduates | 博导智言 (assisting undergraduates with their studies)
☆13 · May 9, 2024 · Updated 2 years ago
linjh1118 / Llama3-Chinese-ORPO
A Chinese version of Llama3, obtained from Llama3 through further CPT, SFT, and ORPO
☆16 · Apr 24, 2024 · Updated 2 years ago
icip-cas / LiteCoder
Advancing Small and Medium-sized Code Agents.
☆17 · May 29, 2026 · Updated last month
LindgeW / MetaAug4NER
Robust Self-augmentation for NER with Meta-reweighting
☆29 · Nov 8, 2022 · Updated 3 years ago
DawnEver / mcm-icm-typst-template
☆11 · Jan 29, 2026 · Updated 5 months ago
YChen1993 / ICLRec
Intent Contrastive Learning for Sequential Recommendation (WWW'22)
☆16 · Mar 16, 2022 · Updated 4 years ago
Xiaohan-Chen / eat_pytorch_in_20_days
Pytorch🍊🍉 is delicious, just eat it! 😋😋
☆10 · Feb 13, 2026 · Updated 5 months ago
Bollegala / DARep
Cross-domain word representation learning
☆10 · May 23, 2015 · Updated 11 years ago
dhcode-cpp / X-R1
Minimal-cost training of 0.5B R1-Zero
☆816 · May 14, 2025 · Updated last year
ZzYUuuu / LightCCF
[SIGIR25] Unveiling Contrastive Learning's Capability of Neighborhood Aggregation for Collaborative Filtering
☆18 · Jul 22, 2025 · Updated 11 months ago
Infini-AI-Lab / astraflow
Dataflow-Oriented Reinforcement Learning for (Multi-)Agentic LLMs
☆95 · Jul 14, 2026 · Updated last week
OpenMatch / SANTA
☆12 · Jul 13, 2023 · Updated 3 years ago
GAIR-NLP / LIMOPro
☆15 · May 27, 2025 · Updated last year
Somebodynew-pw / Universal-safety-announcement-
This announcement is used in ATMHUFK's video. The original is from another uploader, called 原无奇变 in Chinese. You can use it to av…
☆10 · Jan 26, 2025 · Updated last year
zhangir-azerbayev / MetaMath
☆11 · Oct 11, 2023 · Updated 2 years ago
sanowl / CoRAG
Based on the paper "Chain-of-Retrieval Augmented Generation"
☆15 · Mar 29, 2025 · Updated last year
YangLing0818 / SuperCorrect-llm
[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
☆90 · Mar 23, 2025 · Updated last year
Lee-CBG / ATM-TCR
☆12 · Mar 22, 2024 · Updated 2 years ago
jefferyYu / EMNLP18_codes
☆11 · Apr 29, 2019 · Updated 7 years ago
shaohao011 / MedCCO
[ACM MM2026] This is the official implementation of MedCCO
☆17 · Jul 12, 2026 · Updated last week
cx0 / geneformer-finetune
☆13 · Jun 4, 2023 · Updated 3 years ago
zc277584121 / akcio
Akcio is a demonstration project for Retrieval Augmented Generation (RAG). It leverages the power of LLM to generate responses and uses v…
☆12 · Oct 30, 2023 · Updated 2 years ago
Dielianss / Chinese-BERT-KPE
☆10 · Apr 6, 2022 · Updated 4 years ago
HanSolo9682 / CounterCurate
This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.
☆19 · Jun 27, 2024 · Updated 2 years ago
ictnlp / LSG
The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”
☆15 · Jan 3, 2025 · Updated last year
aladinD / SafeMERGE
Code for SafeMERGE (ICLR 2025).
☆15 · Apr 1, 2025 · Updated last year
ANRGUSC / covid19_risk_estimation
COVID-19 Risk Estimation for L.A. County using a Bayesian Time-varying SIR-model
☆12 · Feb 17, 2023 · Updated 3 years ago
chang-github-00 / LLM-Predictive-Decoding
☆16 · Jul 9, 2025 · Updated last year
yyDing1 / ScaleQuest
[ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLM…
☆69 · Oct 27, 2024 · Updated last year
xhy963319431 / WEARec
Wavelet Enhanced Adaptive Frequency Filter for Sequential Recommendation, AAAI-26
☆24 · Apr 10, 2026 · Updated 3 months ago
lemurproject / ClueWeb22
☆17 · Dec 11, 2024 · Updated last year
yyysjz1997 / Awesome-AudioVision-Multimodal
A list of current Audio-Vision Multimodal with awesome resources (paper, application, data, review, survey, etc.).
☆34 · Oct 11, 2023 · Updated 2 years ago
PasaLab / NAS-CTR
☆13 · Oct 31, 2022 · Updated 3 years ago