Open source implementation of the paper "MM-Vid: Advancing Video Understanding with GPT-4V(ision)".
☆39Jan 4, 2026Updated 2 months ago
Alternatives and similar repositories for MM-VID
Users who are interested in MM-VID are comparing it to the libraries listed below.
- [AAAI2025] Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark☆27Jan 4, 2026Updated 2 months ago
- [NeurIPS2023] Exploring Diverse In-Context Configurations for Image Captioning☆43Nov 26, 2024Updated last year
- [AAAI2025] Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient☆44Apr 17, 2025Updated 10 months ago
- Training Vision Transformers for Semi-Supervised Semantic Segmentation☆14Nov 3, 2025Updated 4 months ago
- An in-context learning research testbed☆19Mar 16, 2025Updated 11 months ago
- ☆20Sep 19, 2023Updated 2 years ago
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights☆56Mar 31, 2025Updated 11 months ago
- [NIPS 25'] Evaluation code of paper "KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models"☆40Oct 19, 2025Updated 4 months ago
- ☆12Jan 21, 2025Updated last year
- Python package to extract and analyse Canadian, United States and Indian real estate data from REALTOR.CA, REALTOR.COM and HOUSING.COM☆16Dec 21, 2025Updated 2 months ago
- ☆11Jun 11, 2024Updated last year
- ☆11May 24, 2024Updated last year
- Simple drawing app for Apple Pencil using MetalKit☆16Jan 17, 2026Updated last month
- Offline Search Engine☆16Nov 8, 2025Updated 4 months ago
- Code and data repository for ISMIR 2019 paper: MIDI–SHEET MUSIC ALIGNMENT USING BOOTLEG SCORE SYNTHESIS☆12Mar 1, 2022Updated 4 years ago
- Manipulate panels in window for Electron.☆14May 27, 2017Updated 8 years ago
- ☆11Dec 16, 2024Updated last year
- Neural combinatorial optimization with equivariant quantum circuits.☆12May 13, 2022Updated 3 years ago
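- A small book list that collects PDFs of original English-language books on computer science and technology.☆10Jan 13, 2022Updated 4 years ago
- An emulator for the Noah (诺亚舟) learning machine☆10Aug 17, 2019Updated 6 years ago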
- ☆34Jan 9, 2026Updated 2 months ago
- [ACM MM-24] Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization☆12Oct 8, 2024Updated last year
- Personalized Image Generation with Large Multimodal Models☆14May 13, 2025Updated 9 months ago
- ☆10Dec 3, 2021Updated 4 years ago
- A Python file that cuts a video clip based on human-written instructions inside the Final Cut Pro video editor. See Jupyter notebook for d…☆11Jul 21, 2021Updated 4 years ago
- Contains the model patches and the eval logs from the passing swe-bench-lite run.☆10Jun 28, 2024Updated last year
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning"☆13Jun 22, 2025Updated 8 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆54Mar 9, 2025Updated last year
- GraphRag vs Embeddings☆16Jul 14, 2024Updated last year
- A Python script that utilizes OpenAI's GPT-3.5-turbo model to process and summarize long-form text files.☆11Mar 19, 2023Updated 2 years ago
- ☆13Jun 11, 2025Updated 8 months ago
- A javascript implementation of the YIN algorithm for Node.js and the browser☆12Feb 24, 2018Updated 8 years ago
- official code for "3D Question Answering via only 2D Vision-Language Models"☆23Updated this week
- PyTorch Implementation of [AudioLCM]: an efficient and high-quality text-to-audio generation with latent consistency model.☆13Jun 15, 2024Updated last year
- Collection of different basic JavaScript projects created using vanilla JavaScript, html, CSS and Tailwind CSS.☆11Jul 1, 2024Updated last year
- Read German laws on the command line☆14Dec 13, 2025Updated 2 months ago
- ☆19Sep 11, 2025Updated 5 months ago
- ☆13May 28, 2025Updated 9 months ago
- Remon-OBS-Studio(ROS): Program that broadcasts using WebRTC and is based on the obs-studio project and the Pion WebRTC project.☆11Sep 6, 2019Updated 6 years ago