GuyARoss / CLIP-video-search
demo natural language video db using CLIP
☆27Updated last year
Alternatives and similar repositories for CLIP-video-search
Users who are interested in CLIP-video-search are comparing it to the libraries listed below:
- Chinese CLIP models with SOTA performance.☆57Updated 2 years ago
- ☆70Updated 2 years ago
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆27Updated last year
- Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi…☆50Updated last year
- Codebase for the Recognize Anything Model (RAM)☆83Updated last year
- Code for the Video Similarity Challenge.☆80Updated last year
- 2019 CCF 大数据与计算智能大赛 视频版权检测算法 复赛第8名方案 | 8th place solution of Video Copyright Detection Algorithm Track, 2019 CCF Big Data & Computing Int…☆30Updated 5 years ago
- Research Code for Multimodal-Cognition Team in Ant Group☆164Updated last month
- Code and model for the AI City Challenge (CVPR 2022) Track 3 Action Detection (Naturalistic Driving Action Recognition)☆29Updated 2 years ago
- Our 2nd-gen LMM☆34Updated last year
- Florence-2☆69Updated 6 months ago
- Facebook Image Similarity Challenge 2021☆19Updated 3 years ago
- TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision [AAAI2023 Oral]☆55Updated 2 years ago
- A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM.☆38Updated 11 months ago
- ☆29Updated 3 years ago
- Low-latency ONNX and TensorRT based zero-shot classification and detection with contrastive language-image pre-training based prompts☆43Updated last year
- TagGPT: Large Language Models are Zero-shot Multimodal Taggers☆63Updated 2 years ago
- Use CLIP to represent video for Retrieval Task☆70Updated 4 years ago
- A simple Python tool to extract key-frames from a video file using peak estimation from frame difference.☆184Updated 2 months ago
- ☆18Updated 2 years ago
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM☆100Updated last year
- Condensed Movies Challenge 2021☆19Updated 2 years ago
- Easiest way of fine-tuning HuggingFace video classification models☆142Updated 2 years ago
- Toward Universal Multimodal Embedding☆55Updated last month
- This code implements a versatile image search engine leveraging the CLIP model and FAISS, capable of processing both text-to-image and i…☆48Updated last year
- This repository contains the dataset, codebase, and benchmarks for our paper: <CNVid-3.5M: Build, Filter, and Pre-train the Large-scale P…☆25Updated last year
- Masked Vision-Language Transformer in Fashion☆35Updated last year
- Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation☆142Updated 10 months ago
- This repository contains the codebase for MovieCLIP: Visual Scene Recognition in Movies☆41Updated last year
- Using VideoBERT to tackle video prediction☆130Updated 4 years ago