Gaiejj / omniairl
A trustworthy benchmark for IAIR Reinforcement Learning homework
☆9 Updated 2 years ago
Alternatives and similar repositories for omniairl:
Users who are interested in omniairl are comparing it to the libraries listed below
- The homework of robos learning base. ☆10 Updated last year
- [CVPR'25] Interleaved-Modal Chain-of-Thought ☆28 Updated this week
- ☆113 Updated 2 months ago
- Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning. ☆101 Updated this week
- A complete introductory course to programming, computer systems and software development (continuously updating). ☆12 Updated last year
- A collection on the recent reproduction papers and projects on DeepSeek-R1 ☆29 Updated 2 months ago
- Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning ☆22 Updated 2 weeks ago
- MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency ☆100 Updated 3 weeks ago
- [Blog 1] Recording a bug of grpo_trainer in some R1 projects ☆19 Updated 2 months ago
- [NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought … ☆302 Updated 4 months ago
- SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enh… ☆30 Updated 8 months ago
- A curated list of awesome papers on dataset reduction, including dataset distillation (dataset condensation) and dataset pruning (coreset… ☆55 Updated 3 months ago
- [CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key ☆48 Updated 3 weeks ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆145 Updated last month
- A Self-Training Framework for Vision-Language Reasoning ☆76 Updated 3 months ago
- MM-PRM: An open implementation of Multimodal OmegaPRM and its corresponding training pipeline ☆13 Updated last month
- Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents ☆40 Updated last week
- A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models. ☆56 Updated last month
- VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models ☆53 Updated 9 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆67 Updated 2 months ago
- Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference". ☆93 Updated last month
- Survey on Data-centric Large Language Models ☆83 Updated 9 months ago
- 🔥CVPR 2025 Multimodal Large Language Models Paper List ☆136 Updated last month
- [NeurIPSw'24] This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simu… ☆87 Updated 3 months ago
- [NeurIPS 2024] Repos for "Visualization-of-Thought" dataset, construction code and evaluation. ☆24 Updated 6 months ago
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models ☆84 Updated 7 months ago
- An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models. ☆114 Updated 2 weeks ago
- ☆93 Updated last week
- [NeurIPS2023] Exploring Diverse In-Context Configurations for Image Captioning ☆37 Updated 5 months ago
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization ☆86 Updated last year