yongliang-wu / MM-VID
Open source implementation of the paper "MM-Vid: Advancing Video Understanding with GPT-4V(ision)".
☆39 · Jan 4, 2026 · Updated last month
Alternatives and similar repositories for MM-VID
Users that are interested in MM-VID are comparing it to the libraries listed below
- [AAAI2025] Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark ☆26 · Jan 4, 2026 · Updated last month
- [AAAI2025] Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient ☆44 · Apr 17, 2025 · Updated 10 months ago
- An in-context learning research testbed ☆19 · Mar 16, 2025 · Updated 11 months ago
- Using distilled CLIP model to deploy the android device ☆20 · Feb 28, 2023 · Updated 2 years ago
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights ☆55 · Mar 31, 2025 · Updated 10 months ago
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment ☆29 · Sep 27, 2024 · Updated last year
- Python package to extract and analyse Canadian, United States and Indian real estate data from REALTOR.CA, REALTOR.COM and HOUSING.COM ☆15 · Dec 21, 2025 · Updated last month
- ☆12 · Jan 21, 2025 · Updated last year
- ☆11 · Jun 11, 2024 · Updated last year
- A beautiful Astro theme based on Ghost Simply theme ☆12 · Updated this week
- ☆13 · May 13, 2025 · Updated 9 months ago
- [ACM MM-24] Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization ☆12 · Oct 8, 2024 · Updated last year
- Simple drawing app for Apple Pencil using MetalKit ☆16 · Jan 17, 2026 · Updated last month
- ☆12 · Apr 25, 2025 · Updated 9 months ago
- Simple PDF Editor is a collection of Python scripts that can be used to perform basic operations on existing PDF files. ☆11 · Jun 20, 2021 · Updated 4 years ago
- ☆11 · Mar 16, 2021 · Updated 4 years ago
- A Python file that cuts a video clip based on human-written instructions inside the FinalCut Pro video editor. See Jupiter notebook for d… ☆11 · Jul 21, 2021 · Updated 4 years ago
- A hybrid quantum-classical neural network simulation platform. Quantum simulation uses QTensor, a state-of-the-art tensor network-based s… ☆14 · Jun 27, 2023 · Updated 2 years ago
- Accompanying source code for the journal paper "A Baseline for General Music Object Detection with Deep Learning" ☆12 · Updated this week
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 · Jun 22, 2025 · Updated 7 months ago
- ☆22 · Jan 12, 2026 · Updated last month
- A small reading list that collects and organizes PDFs of original English-language books on computer science and technology. ☆10 · Jan 13, 2022 · Updated 4 years ago
- A mobile GUI search engine using a vision-language model ☆14 · May 5, 2025 · Updated 9 months ago
- Offline Search Engine ☆16 · Nov 8, 2025 · Updated 3 months ago
- ☆10 · Dec 3, 2021 · Updated 4 years ago
- Contains the model patches and the eval logs from the passing swe-bench-lite run. ☆10 · Jun 28, 2024 · Updated last year
- Personalized Image Generation with Large Multimodal Models ☆14 · May 13, 2025 · Updated 9 months ago
- ☆13 · Jun 18, 2024 · Updated last year
- ☆11 · Updated this week
- iOS Microphone Stethoscope to Measure Heart Rate using AudioKit and Custom FFT ☆12 · Apr 11, 2020 · Updated 5 years ago
- Read German laws on the command line ☆12 · Dec 13, 2025 · Updated 2 months ago
- Collection of different basic JavaScript projects created using vanilla JavaScript, html, CSS and Tailwind CSS. ☆11 · Jul 1, 2024 · Updated last year
- ☆13 · May 30, 2021 · Updated 4 years ago
- Spacedrive native dependencies ☆13 · Apr 8, 2025 · Updated 10 months ago
- libpostal wrapper python package for windows ☆17 · Aug 12, 2023 · Updated 2 years ago
- official code for "3D Question Answering via only 2D Vision-Language Models" ☆23 · Jan 15, 2026 · Updated last month
- GraphRag vs Embeddings ☆16 · Jul 14, 2024 · Updated last year
- A javascript implementation of the YIN algorithm for Node.js and the browser ☆12 · Feb 24, 2018 · Updated 7 years ago
- ☆13 · May 28, 2025 · Updated 8 months ago