GRPO Algorithm for Llava Architecture (Based on Verl)
☆49May 9, 2025Updated 10 months ago
Alternatives and similar repositories for GRPO-for-Llava
Users who are interested in GRPO-for-Llava are comparing it to the libraries listed below.
- ☆20Jun 13, 2025Updated 8 months ago
- FakeReasoning: Towards Generalizable Forgery Detection and Reasoning.☆15Aug 28, 2025Updated 6 months ago
- ☆29Jul 14, 2025Updated 7 months ago
- The code implementation of the <Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning> in The Co…☆14May 25, 2023Updated 2 years ago
- [ACM MM24 Poster] Official implementation of paper "MVPbev: Multi-view Perspective Image Generation from BEV with Test-time Controllabili…☆20Sep 6, 2025Updated 6 months ago
- Official implementation of Layout-aware Dreamer for Embodied Referring Expression Grounding [AAAI 23].☆16Apr 13, 2023Updated 2 years ago
- [AAAI2024] Summarizing Stream Data for Memory-Restricted Online Continual Learning☆21Apr 30, 2024Updated last year
- [ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models☆20Jul 17, 2024Updated last year
- MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models☆40Jan 28, 2026Updated last month
- Official PyTorch implementation of "Hyperbolic VAE via Latent Gaussian Distributions"☆23Oct 26, 2023Updated 2 years ago
- Official PyTorch implementation of the paper "Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Princ…☆41Jul 18, 2025Updated 7 months ago
- Graph Debiased Contrastive Learning with Joint Representation Clustering☆25May 10, 2023Updated 2 years ago
- [CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method.☆33Jul 12, 2023Updated 2 years ago
- Geometric Adversarial Attacks and Defenses on 3D Point Clouds (3DV 2021)☆26Jun 25, 2023Updated 2 years ago
- ☆18Jun 10, 2025Updated 8 months ago
- [ICLR 2025] Official PyTorch Implementation for CPE: Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Ga…☆12Apr 7, 2025Updated 11 months ago
- [ICRA 2024] WLST: Weak Labels Guided Self-training for Weakly-supervised Domain Adaptation on 3D Object Detection☆12Feb 6, 2024Updated 2 years ago
- Official repository for the NuScenes-MQA. This paper is accepted by LLVA-AD Workshop at WACV 2024.☆35Dec 21, 2023Updated 2 years ago
- The final true story of the ByteDance gossip: let the facts speak; justice may be late, but it will not be absent!☆23Oct 18, 2024Updated last year
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"☆36Apr 21, 2024Updated last year
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]☆183Jun 5, 2025Updated 9 months ago
- [NeurIPS 2023] Official Implementation of A Generic Active Learning Baseline for LiDAR Semantic Segmentation☆32Apr 26, 2024Updated last year
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆29Jan 13, 2026Updated last month
- Implementation of the CVPR2025 paper LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty.☆17Sep 10, 2025Updated 5 months ago
- Code for our EMNLP 2022 paper: Generative Entity Typing with Curriculum Learning.☆13Aug 19, 2023Updated 2 years ago
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning☆40Jul 29, 2023Updated 2 years ago
- For the paper "Gaussian Transformer: A Lightweight Approach for Natural Language Inference"☆28Feb 23, 2020Updated 6 years ago
- Code for paper: Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection☆52Mar 13, 2025Updated 11 months ago
- ☆32Sep 24, 2023Updated 2 years ago
- The official implementation of "EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis".☆105Feb 12, 2026Updated 3 weeks ago
- Proof-carrying code completions in Dafny☆11Apr 4, 2025Updated 11 months ago
- [ICLR 2026] SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models☆75Jan 29, 2026Updated last month
- ☆14Aug 28, 2024Updated last year
- Documentation at☆14Mar 27, 2025Updated 11 months ago
- Policy Optimization is awesome, let’s put a tree on it! 🌲🌟☆22Jul 4, 2025Updated 8 months ago
- The homework of robos learning base.☆11May 23, 2023Updated 2 years ago
- Implementation for the paper "Unified Multimodal Model with Unlikelihood Training for Visual Dialog"☆13May 12, 2023Updated 2 years ago
- [MM2024 Oral] 3D-GRES: Generalized 3D Referring Expression Segmentation☆42Dec 15, 2024Updated last year
- [ICLR 2026] Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization☆18Feb 14, 2026Updated 3 weeks ago