Yueeeeeeee / HRPO
Hybrid Latent Reasoning via Reinforcement Learning
☆120 · Updated 3 weeks ago
Alternatives and similar repositories for HRPO
Users who are interested in HRPO are comparing it to the repositories listed below.
- SQL-o1: A Self-Reward Heuristic Dynamic Search Method for Text-to-SQL ☆185 · Updated last month
- [ACL 25 main] Deliberate Reasoning in Language Models as Structure-Aware Planning with an Accurate World Model ☆34 · Updated last month
- ☆63 · Updated 2 weeks ago
- Official code of "StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs". ☆42 · Updated last week
- [ACL'25] Code for "Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering" ☆20 · Updated 3 weeks ago
- [EMNLP 2024 Findings] Official PyTorch Implementation of "Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Ge… ☆39 · Updated 4 months ago
- StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving ☆22 · Updated 6 months ago
- ☆48 · Updated 8 months ago
- Code of Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Ne… ☆26 · Updated last year
- A collection of papers related to knowledge fusion ☆56 · Updated 8 months ago
- ☆84 · Updated last week
- ☆54 · Updated last week
- Collecting personality-indicative data for role-playing agents. ☆22 · Updated 4 months ago
- ACL 2024 ☆32 · Updated 9 months ago
- [EMNLP 2024] DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models ☆70 · Updated 2 weeks ago
- Official Code of Logits-Based-Finetuning ☆85 · Updated last week
- RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response ☆41 · Updated 6 months ago
- ☆36 · Updated last year
- ☆163 · Updated last week
- [COLING Demos 2025] an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs ☆36 · Updated 3 months ago
- Your efficient and accurate answer verification system for RL training. ☆30 · Updated this week
- ☆75 · Updated 2 weeks ago
- [ACL 2023 findings] Towards Robust Personalized Dialogue Generation via Order-Insensitive Representation Regularization ☆17 · Updated last year
- EvaLearn is a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in chall… ☆120 · Updated last week
- An open-source highly heterogeneous entity alignment (HHEA) toolkit. ☆31 · Updated last year
- ☆101 · Updated 3 weeks ago
- Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning ☆75 · Updated 2 months ago
- An Extensible Framework for Retrieval-Augmented LLM Applications: Learning Relevance Beyond Simple Similarity. ☆39 · Updated 6 months ago
- ☆65 · Updated 8 months ago
- ☆53 · Updated last month