dannyXSC / Fudan_FreshmanTest
Fudan University graduate student orientation test
☆14Updated last year
Alternatives and similar repositories for Fudan_FreshmanTest
Users who are interested in Fudan_FreshmanTest are comparing it to the repositories listed below
- R1-like Video-LLM for Temporal Grounding☆88Updated last month
- Collections of Papers and Projects for Multimodal Reasoning.☆104Updated 2 weeks ago
- [ICLR2025] Official code implementation of Video-UTR: Unhackable Temporal Rewarding for Scalable Video MLLMs☆52Updated 2 months ago
- [CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?☆56Updated last month
- A python script for downloading huggingface datasets and models.☆19Updated last month
- A paper list for spatial reasoning☆60Updated last month
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models☆84Updated 8 months ago
- Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models☆24Updated this week
- [CVPR 2025] The code for paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".☆99Updated 3 weeks ago
- [LLaVA-Video-R1]✨First Adaptation of R1 to LLaVA-Video (2025-03-18)☆28Updated last week
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.☆184Updated this week
- 🔥CVPR 2025 Multimodal Large Language Models Paper List☆142Updated 2 months ago
- RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints☆44Updated last month
- Official repo of VLABench, a large scale benchmark designed for fairly evaluating VLA, Embodied Agent, and VLMs.☆217Updated 2 weeks ago
- AL-Ref-SAM 2: Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segm…☆81Updated 4 months ago
- [Blog 1] Recording a bug of grpo_trainer in some R1 projects☆20Updated 2 months ago
- Official repository of NeurIPS D&B Track 2024 paper "VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understan…☆34Updated 3 months ago
- TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning☆65Updated last week
- Official implementation of "Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness".☆22Updated last month
- ⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.☆136Updated last week
- ☆117Updated 3 months ago
- Official code for MotionBench (CVPR 2025)☆37Updated 2 months ago
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".☆28Updated 2 weeks ago
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, …☆114Updated last week
- [CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online☆35Updated last month
- Accepted by CVPR 2024☆33Updated last year
- A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.☆60Updated last month
- [CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding☆60Updated 3 weeks ago
- ☆83Updated last month
- (CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction☆94Updated 2 months ago