JacksonCakes / vision-r1
☆11Updated 2 months ago
Alternatives and similar repositories for vision-r1
Users who are interested in vision-r1 are comparing it to the repositories listed below
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆36Updated last year
- A repository for research on medium sized language models.☆76Updated last year
- CycleQD is a framework for parameter space model merging.☆40Updated 4 months ago
- ☆16Updated 3 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.☆26Updated 2 months ago
- Lego for GRPO☆28Updated last week
- Plug-and-play PyTorch implementation of the paper: "Evolutionary Optimization of Model Merging Recipes" by Sakana AI☆29Updated 6 months ago
- ☆14Updated last year
- ☆19Updated this week
- MatFormer repo☆26Updated 5 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning☆54Updated 2 months ago
- ☆49Updated 7 months ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆100Updated 3 months ago
- Official repo of paper LM2☆40Updated 3 months ago
- ☆22Updated last year
- ☆13Updated 5 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆67Updated 2 months ago
- Unofficial Implementation of Evolutionary Model Merging☆38Updated last year
- Train your own SOTA deductive reasoning model☆93Updated 3 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,…☆47Updated last month
- Lottery Ticket Adaptation☆39Updated 6 months ago
- ☆21Updated 5 months ago
- ☆79Updated 9 months ago
- Official implementation of ECCV24 paper: POA☆24Updated 10 months ago
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508)☆27Updated last week
- Simple GRPO scripts and configurations.☆58Updated 4 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆72Updated 2 weeks ago
- [ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models☆27Updated 2 months ago
- Exploration of automated dataset selection approaches at large scales.☆42Updated 3 months ago