GuyARoss / CLIP-video-search
Demo of a natural-language video database using CLIP
☆23Updated 7 months ago
Alternatives and similar repositories for CLIP-video-search:
Users who are interested in CLIP-video-search are comparing it to the libraries listed below:
- This code implements a versatile image search engine leveraging the CLIP model and FAISS, capable of processing both text-to-image and i…☆42Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆26Updated last year
- Code for CVPR 2022 paper "Scene Consistency Representation Learning for Video Scene Segmentation"☆94Updated 2 years ago
- Chinese CLIP models with SOTA performance.☆54Updated last year
- A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"☆13Updated 3 years ago
- Masked Vision-Language Transformer in Fashion☆33Updated last year
- [NeurIPS 2022 Spotlight] VideoMAE for Action Detection☆59Updated 2 years ago
- Use CLIP to represent video for Retrieval Task☆69Updated 4 years ago
- Fine-tuning Qwen2.5-VL for vision-language tasks | Optimized for Vision understanding | LoRA & PEFT support.☆33Updated last month
- Research Code for Multimodal-Cognition Team in Ant Group☆139Updated 8 months ago
- [CVPR 2022 - Demo Track] - Effective conditioned and composed image retrieval combining CLIP-based features☆78Updated 4 months ago
- A simple MLLM that surpasses QwenVL-Max using only open-source data and a 14B LLM.☆37Updated 6 months ago
- Our 2nd-gen LMM☆33Updated 10 months ago
- Code for the Video Similarity Challenge.☆77Updated last year
- Repository for the MM '23 accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi…☆49Updated last year
- [ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model☆43Updated 3 months ago
- A simple script that reads a directory of videos, grabs a random frame, and automatically discovers a prompt for it☆134Updated last year
- TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision [AAAI2023 Oral]☆54Updated 2 years ago
- Chinese Stable Diffusion (zh SD) for Chinese text-to-image generation☆48Updated last year
- Chinese CLIP encoder☆22Updated 2 years ago
- [ICCV 2023] Accurate and Fast Compressed Video Captioning☆39Updated last year
- Original Inference Repository of the Paper: "Domain-Adaptive Self-Supervised Pre-training for Face & Body Detection in Drawings"☆29Updated last year
- Official code for the paper "Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark"☆34Updated last year
- Facebook Image Similarity Challenge 2021☆19Updated 3 years ago
- You Only Watch One Frame for Online Spatio-Temporal Action Detection☆33Updated last year
- Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring"☆99Updated last year
- GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)☆64Updated last year