JacksonCakes / vision-r1
☆12Updated 5 months ago
Alternatives and similar repositories for vision-r1
Users who are interested in vision-r1 are comparing it to the libraries listed below
- CycleQD is a framework for parameter space model merging.☆44Updated 7 months ago
- KV Cache Steering for Inducing Reasoning in Small Language Models☆39Updated last month
- PyTorch implementation of models from the Zamba2 series.☆185Updated 7 months ago
- A repository for research on medium sized language models.☆77Updated last year
- Lottery Ticket Adaptation☆39Updated 10 months ago
- Simple repository for training small reasoning models☆40Updated 7 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆105Updated 3 months ago
- ☆21Updated last month
- ☆77Updated last month
- Verifiers for LLM Reinforcement Learning☆72Updated 5 months ago
- ☆54Updated 10 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.☆172Updated 8 months ago
- Train, tune, and infer Bamba model☆132Updated 3 months ago
- Esoteric Language Models☆99Updated last month
- ☆35Updated 4 months ago
- A testbed for agents and environments that can automatically improve models through data generation.☆27Updated 6 months ago
- EvaByte: Efficient Byte-level Language Models at Scale☆109Updated 5 months ago
- ☆73Updated 2 months ago
- ☆55Updated 6 months ago
- Code and training scripts for FlexOlmo☆98Updated this week
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"☆25Updated 3 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆60Updated last year
- ☆85Updated last year
- Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models"☆41Updated 5 months ago
- ☆47Updated last year
- Source code for the collaborative reasoner research project at Meta FAIR.☆103Updated 5 months ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044☆35Updated 11 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆95Updated this week
- ☆58Updated 4 months ago
- [ICML 2025] Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction☆68Updated 3 months ago