☆285 · Jan 29, 2026 · Updated last month
Alternatives and similar repositories for digitalhuman
Users who are interested in digitalhuman are comparing it to the libraries listed below.
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆15 · Feb 9, 2026 · Updated 3 weeks ago
- [ICLR 2026] Efficient Agent Training for Computer Use ☆138 · Sep 5, 2025 · Updated 5 months ago
- From Word to World: Can Large Language Models be Implicit Text-based World Models? ☆48 · Dec 25, 2025 · Updated 2 months ago
- ☆87 · Aug 16, 2025 · Updated 6 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 · Oct 9, 2025 · Updated 4 months ago
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. ☆256 · Aug 12, 2025 · Updated 6 months ago
- Codebase for Instruction Following without Instruction Tuning ☆36 · Sep 24, 2024 · Updated last year
- PreAct: Prediction Enhances Agent's Planning Ability (Coling2025) ☆30 · Dec 12, 2024 · Updated last year
- [AAAI 2026] The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants ☆46 · Dec 11, 2025 · Updated 2 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆159 · Jun 26, 2025 · Updated 8 months ago
- ☆16 · Jul 23, 2024 · Updated last year
- [NAACL 2025] Representing Rule-based Chatbots with Transformers ☆23 · Feb 9, 2025 · Updated last year
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning ☆62 · Oct 24, 2025 · Updated 4 months ago
- A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details. ☆224 · Jul 25, 2025 · Updated 7 months ago
- Source code of our EMNLP 2024 paper "FactAlign: Long-form Factuality Alignment of Large Language Models" ☆19 · Oct 3, 2024 · Updated last year
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆26 · Aug 9, 2025 · Updated 6 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆61 · Aug 30, 2024 · Updated last year
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆86 · May 21, 2025 · Updated 9 months ago
- ☆111 · Jan 8, 2025 · Updated last year
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation ☆71 · Oct 17, 2025 · Updated 4 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ☆92 · Aug 8, 2025 · Updated 6 months ago
- Extrapolating RLVR to General Domains without Verifiers ☆201 · Aug 12, 2025 · Updated 6 months ago
- Twinkle✨: Training workbench to make your model glow. ☆45 · Updated this week
- ☆25 · Aug 19, 2025 · Updated 6 months ago
- [EMNLP 2025] WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning ☆75 · Nov 4, 2025 · Updated 4 months ago
- A benchmark dataset designed to support the development and evaluation of large language models (LLMs) for conversational mental health a… ☆16 · Feb 24, 2025 · Updated last year
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆52 · Jul 15, 2025 · Updated 7 months ago
- Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments ☆48 · Jan 8, 2026 · Updated last month
- ☆115 · Jun 11, 2025 · Updated 8 months ago
- ☆74 · Jun 28, 2025 · Updated 8 months ago
- This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language … ☆174 · May 14, 2025 · Updated 9 months ago
- ☆14 · Mar 5, 2024 · Updated last year
- ☆14 · Jan 24, 2025 · Updated last year
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models ☆17 · Nov 4, 2025 · Updated 4 months ago
- ☆16 · Jul 29, 2025 · Updated 7 months ago
- ☆524 · Feb 4, 2026 · Updated last month
- ☆60 · Jan 12, 2026 · Updated last month
- Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue (ACL 2024) ☆25 · Oct 18, 2025 · Updated 4 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆123 · May 6, 2025 · Updated 9 months ago