OpenMOSS / Lorsa
☆19 Updated 3 weeks ago
Alternatives and similar repositories for Lorsa
Users who are interested in Lorsa are comparing it to the repositories listed below
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆15 Updated 2 weeks ago
- ☆16 Updated 2 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 9 months ago
- ☆29 Updated 2 weeks ago
- How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training ☆34 Updated last month
- ☆64 Updated 2 months ago
- Official Code Release for "Training a Generally Curious Agent" ☆21 Updated 2 weeks ago
- ☆79 Updated 9 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆32 Updated 2 months ago
- ☆21 Updated 5 months ago
- ☆32 Updated 4 months ago
- Reinforcing General Reasoning without Verifiers ☆33 Updated last week
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆27 Updated this week
- ☆24 Updated 8 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆67 Updated 2 months ago
- Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks" ☆20 Updated last month
- ☆38 Updated this week
- ☆50 Updated this week
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆28 Updated 8 months ago
- Process Reward Models That Think ☆38 Updated this week
- ☆11 Updated 10 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆33 Updated 2 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆88 Updated last week
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆31 Updated 2 months ago
- The official code repo and data hub of top_nsigma sampling strategy for LLMs. ☆25 Updated 3 months ago
- Verifiers for LLM Reinforcement Learning ☆55 Updated last month
- Scaling Computer-Use Grounding via UI Decomposition and Synthesis ☆49 Updated this week
- A repository for research on medium sized language models. ☆76 Updated last year
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆90 Updated 2 months ago
- ☆49 Updated 6 months ago