Jackymn25 / utm-professor-analysis-rmp
Web crawling from RMP and data ranking
☆14Updated last month
Alternatives and similar repositories for utm-professor-analysis-rmp
Users who are interested in utm-professor-analysis-rmp are comparing it to the repositories listed below
- [CVPR 2025] Code for "Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering".☆15Updated last month
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models.☆48Updated 5 months ago
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models.☆18Updated 6 months ago
- Retrieval-Augmented Generation for YouTube videos with a BRAD agent☆33Updated 5 months ago
- The official repository for CVPRW2024 paper "What’s in a Name? Beyond Class Indices for Image Recognition"☆13Updated 10 months ago
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆42Updated 11 months ago
- This is the official page of WikiAutoGen, ICCV2025☆15Updated 3 weeks ago
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Updated last year
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"☆20Updated 3 months ago
- Official implementation of "PyVision: Agentic Vision with Dynamic Tooling."☆69Updated last week
- Text-Guided Generation of Full-Body Image with Preserved Reference Face for Customized Animation☆24Updated last year
- Fine-tune of Florence-2 for shot categorization.☆26Updated 4 months ago
- ☆13Updated last year
- [CVPR 2025] ScaleLSD: Scalable Deep Line Segment Detection Streamlined☆16Updated 3 weeks ago
- Official implementation of "VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis"☆19Updated 5 months ago
- ☆16Updated last year
- This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Gener…☆16Updated last year
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"☆46Updated 4 months ago
- ☆14Updated last year
- ☆22Updated 2 months ago
- Anim-Director: Controllable Animation Video Generation with Large Models-based Multimodal Agents☆81Updated last month
- ☆20Updated 4 months ago
- PresentAgent: Multimodal Agent for Presentation Video Generation☆52Updated this week
- We propose FlexEdit, an end-to-end image editing method that leverages both free-shape masks and language instructions for Flexible Editi…☆32Updated 10 months ago
- Official implementation of "Self-Correcting Self-Consuming Loops for Generative Model Training" (ICML 2024)☆33Updated last year
- Official PyTorch implementation of "No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding"☆33Updated last year
- A curated list of papers and resources for text-to-image evaluation.☆29Updated last year
- ☆35Updated last week
- ☆16Updated last year
- ☆18Updated 8 months ago