☆43Mar 6, 2025Updated last year
Alternatives and similar repositories for grpo-loss
Users who are interested in grpo-loss are comparing it to the libraries listed below
- Simplest online-softmax notebook for explaining Flash Attention☆16Jan 27, 2026Updated last month
- Debug DeepSpeed-Chat step by step in an IDE☆10Apr 17, 2023Updated 2 years ago
- differentiable top-k operator☆22Dec 30, 2024Updated last year
- A GPT-powered AI auto scraper for websites. AI Web Scraping made easy.☆14Jun 26, 2023Updated 2 years ago
- DRL for WebRTC Control☆12Feb 3, 2024Updated 2 years ago
- This announcement is used in ATMHUFK's video. The original is from another uploader, called 原无奇变 in Chinese. You can use it to av…☆10Jan 26, 2025Updated last year
- Based on the paper Chain-of-Retrieval Augmented Generation☆14Mar 29, 2025Updated 11 months ago
- Cross-domain word representation learning☆10May 23, 2015Updated 10 years ago
- ☆13Jul 13, 2023Updated 2 years ago
- ☆11Jan 29, 2026Updated last month
- Top-1 solution for the 新网银行 (XW Bank) Cup☆23Dec 16, 2018Updated 7 years ago
- ☆13Nov 9, 2021Updated 4 years ago
- llama3 inference implemented from scratch using numpy and wrapped for reuse, with a comparison of GPU and CPU performance and LoRA fine-tuning.☆12Jul 16, 2024Updated last year
- [EMNLP 2025 Findings] Familiarity-aware Evidence Compression for Retrieval Augmented Generation☆14Aug 20, 2025Updated 6 months ago
- Can VLMs understand students' hand-drawn math work?☆17Jan 20, 2026Updated 2 months ago
- ☆11Apr 29, 2019Updated 6 years ago
- An Ultra-Long Output Reinforcement Learning Approach☆23Jul 31, 2025Updated 7 months ago
- ☆13Jun 4, 2023Updated 2 years ago
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.☆19Jun 27, 2024Updated last year
- This project is based on the Particle Swarm Optimization algorithm and tries to solve a Mobile Edge Computing optimization problem.☆11Jun 19, 2020Updated 5 years ago
- ☆17Dec 11, 2024Updated last year
- Official codes for "Training Deep Q-Network via Monte Carlo Tree Search for Adaptive Bitrate Control in Video Delivery"☆10Jul 21, 2023Updated 2 years ago
- ☆12Jun 30, 2024Updated last year
- Eden Flux LoRA trainer and full-finetuning☆24Mar 21, 2025Updated 11 months ago
- Sabre360: simulation testbed for 360° videos☆14Oct 14, 2020Updated 5 years ago
- Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022☆11Aug 20, 2022Updated 3 years ago
- Analyzing Latent Concept in Pre-trained Transformer Models☆12Jul 18, 2022Updated 3 years ago
- ☆11Jun 15, 2019Updated 6 years ago
- General-purpose simple utilities project☆22Oct 6, 2024Updated last year
- Chinese version of Awesome_CV; clone this project to Overleaf to easily write your own CV☆17May 24, 2024Updated last year
- A reconstruction of RL Introduction and its course materials for a more efficient entry☆21Mar 4, 2026Updated 2 weeks ago
- ☆24Oct 14, 2024Updated last year
- Minimal-cost training of a 0.5B R1-Zero☆811May 14, 2025Updated 10 months ago
- Source code for NeurIPS 2020 paper "Node Classification on Graphs with Few-Shot Novel Labels via Meta Transformed Network Embedding"☆10Nov 17, 2020Updated 5 years ago
- A list of research resources that I've appreciated.☆12Dec 10, 2019Updated 6 years ago
- Reproducible Language Agent Research☆34Jun 25, 2025Updated 8 months ago
- SCCD: a session-based Chinese cyberbullying detection dataset☆18Mar 9, 2025Updated last year
- ☆11Mar 13, 2026Updated last week
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆90Nov 13, 2024Updated last year