wey-gu / grpo-graph-extraction
Qwen GRPO Graph Extraction RL Finetune
☆60Apr 2, 2025Updated 10 months ago
Alternatives and similar repositories for grpo-graph-extraction
Users that are interested in grpo-graph-extraction are comparing it to the libraries listed below
- ☆10Dec 19, 2025Updated last month
- [ICML 2025] Logits are All We Need to Adapt Closed Models☆21May 2, 2025Updated 9 months ago
- ☆25Oct 28, 2024Updated last year
- Test Environment Booking tool☆14Nov 16, 2020Updated 5 years ago
- Model Context Protocol Server for NebulaGraph 3.x☆26Mar 17, 2025Updated 10 months ago
- ☆27Aug 27, 2025Updated 5 months ago
- H.AI cookbook provides code examples and guides to help developers use models developed by H Company.☆65Feb 3, 2026Updated last week
- Chrome extension to add a link from each Arxiv page to the corresponding HF Paper page☆26Jan 4, 2024Updated 2 years ago
- dqn autoplay mario bros☆21Jul 24, 2017Updated 8 years ago
- A travel agent based on Qwen2.5, fine-tuned by SFT + DPO/PPO/GRPO using traveling question-answer dataset, a mindmap can be output using …☆52Nov 14, 2025Updated 3 months ago
- entropix style sampling + GUI☆27Oct 30, 2024Updated last year
- support BM25+vector☆29May 26, 2025Updated 8 months ago
- [EMNLP 2025] Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards☆60Sep 15, 2025Updated 4 months ago
- OpenVLA Lightweight Version (0.5B). It uses qwen2-0.5B and fine-tunes using mllm format, without occupying LLM's inherent tokens. It repre…☆15Jan 7, 2026Updated last month
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou…☆32Apr 20, 2024Updated last year
- OpenDAL fsspec integration☆33Jan 20, 2026Updated 3 weeks ago
- Code for ACL25-findings. An LLM-based agent simulation framework that simulates human behavior and generates dynamic, text-based social g…☆90Oct 23, 2025Updated 3 months ago
- An open source implementation of R1☆29Feb 6, 2026Updated last week
- Trading algorithm for Bitcoins in USD on quantconnect.com☆13Jan 12, 2018Updated 8 years ago
- ☆11Aug 9, 2018Updated 7 years ago
- 🎨 Professional multi-modal AI media generation CLI ✨ Generate videos, images & music with Google AI models 🎬 Interactive UI with bat…☆16Aug 5, 2025Updated 6 months ago
- ☆16Mar 24, 2024Updated last year
- PyTorch implementation of 'CLIP' (Radford et al., 2021) from scratch and training it on Flickr8k + Flickr30k☆11Mar 14, 2024Updated last year
- This is a tool that can make you run intel openVINO Demos and samples easily.☆11Jan 31, 2023Updated 3 years ago
- RLCar Gazebo v2☆12Jun 28, 2024Updated last year
- ☆13May 11, 2022Updated 3 years ago
- Autonomous navigation simulation of an agricultural robot during soil fertilization in open fields using ROS and Gazebo.☆10Apr 8, 2025Updated 10 months ago
- This repository is a reimplementation of the paper(BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model: htt…☆11Nov 14, 2019Updated 6 years ago
- New York Times Scraper☆11Feb 19, 2024Updated last year
- Part of a research scholarship. I built a basic 2d driving sim with simulated lidar data to train Deep Q Neural Network. So far after abo…☆11Feb 15, 2017Updated 8 years ago
- Deploying a stripped-down version of yolov5 on HiSilicon devices☆13Nov 22, 2021Updated 4 years ago
- CLIP-based Adaptive Graph Attention Network for Large-Scale Unsupervised Multi-modal Hashing Retrieval☆10Mar 18, 2024Updated last year
- RDF Community Discussions. Ask anything here!☆13Apr 11, 2024Updated last year
- NLP on Korean news articles. Automatic topic extraction through dynamic clustering.☆12Sep 15, 2017Updated 8 years ago
- Optimized Generative Adversarial Network with Graph Convolutional Networks for Novel Molecule Design☆12Jan 2, 2024Updated 2 years ago
- A Kivy tutorial for PyOhio 2013☆14Apr 30, 2014Updated 11 years ago
- SGBM stereo matching algorithm and point cloud generation☆12Jan 29, 2021Updated 5 years ago
- Natural Language Reinforcement Learning☆101Jul 30, 2025Updated 6 months ago
- [ICLR 2026] GRAPE: Group Representational Position Encoding (https://arxiv.org/abs/2512.07805)☆78Jan 27, 2026Updated 2 weeks ago