menloresearch / visual-thinker
☆150 Updated 2 months ago
Alternatives and similar repositories for visual-thinker:
Users who are interested in visual-thinker are comparing it to the libraries listed below:
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 Updated 3 months ago
- Build your own visual reasoning model ☆357 Updated this week
- ☆84 Updated last week
- Tina: Tiny Reasoning Models via LoRA ☆164 Updated 2 weeks ago
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" ☆220 Updated last month
- ☆199 Updated 2 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ☆49 Updated 3 months ago
- Official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” ☆118 Updated this week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆322 Updated 4 months ago
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks ☆186 Updated 3 weeks ago
- Exploring Applications of GRPO ☆189 Updated last week
- EvaByte: Efficient Byte-level Language Models at Scale ☆91 Updated 2 weeks ago
- Train your own SOTA deductive reasoning model ☆91 Updated 2 months ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆307 Updated 5 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆98 Updated 2 weeks ago
- Code for ExploreTom ☆81 Updated 4 months ago
- Dream 7B, a large diffusion language model ☆613 Updated this week
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" ☆141 Updated 2 weeks ago
- ☆170 Updated 2 weeks ago
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ☆226 Updated this week
- Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling". ☆253 Updated 2 months ago
- Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning" ☆100 Updated 3 weeks ago
- Minimal GRPO implementation from scratch ☆87 Updated last month
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆189 Updated 11 months ago
- GRadient-INformed MoE ☆262 Updated 7 months ago
- PyTorch building blocks for the OLMo ecosystem ☆205 Updated this week
- Rethinking Step-by-step Visual Reasoning in LLMs ☆292 Updated 3 months ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆150 Updated last week
- ☆287 Updated last month
- ☆268 Updated this week