GuyARoss / CLIP-video-search
Demo of a natural-language video database using CLIP
☆25Updated 8 months ago
Alternatives and similar repositories for CLIP-video-search:
Users who are interested in CLIP-video-search are comparing it to the repositories listed below
- Chinese CLIP models with SOTA performance.☆55Updated last year
- Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi…☆49Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆26Updated last year
- Code for CVPR 2022 paper "Scene Consistency Representation Learning for Video Scene Segmentation"☆94Updated 2 years ago
- Code and model for the AI City Challenge (CVPR 2022) Track 3 Action Detection (Naturalistic Driving Action Recognition)☆28Updated last year
- TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision [AAAI2023 Oral]☆54Updated 2 years ago
- Facebook Image Similarity Challenge 2021☆19Updated 3 years ago
- ☆67Updated last year
- Video Copy Segment Localization (VCSL) dataset and benchmark [CVPR2022]☆124Updated last year
- A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"☆13Updated 3 years ago
- This repository contains the dataset, codebase, and benchmarks for our paper: <CNVid-3.5M: Build, Filter, and Pre-train the Large-scale P…☆25Updated last year
- [NeurIPS 2022 Spotlight] VideoMAE for Action Detection☆62Updated 2 years ago
- Code for the Video Similarity Challenge.☆78Updated last year
- official code for paper: Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark☆35Updated last year
- Chinese encoder for CLIP☆22Updated 2 years ago
- [CVPR Challenge Rank 2nd] The code and related files to reproduce the results for Video Similarity Challenge Descriptor Track.☆19Updated last week
- Our 2nd-gen LMM☆33Updated 11 months ago
- Research Code for Multimodal-Cognition Team in Ant Group☆142Updated 9 months ago
- Effective frame sampling for ML applications.☆18Updated 4 months ago
- Large Multimodal Model☆15Updated last year
- A multimodal large-scale model, which performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p…☆14Updated last year
- Code for Recall@k Surrogate Loss with Large Batches and Similarity Mixup, CVPR 2022.☆63Updated 5 months ago
- Fine-tuning Qwen2.5-VL for vision-language tasks | Optimized for Vision understanding | LoRA & PEFT support.☆60Updated 2 months ago
- [CVPR 2023 Workshop] The code to reproduce the results of our solutions on both tracks for Meta AI Video Similarity Challenge (CVPR 2023 Wor…☆49Updated last year
- 8th place solution (semifinal round) of Video Copyright Detection Algorithm Track, 2019 CCF Big Data & Computing Int…☆30Updated 5 years ago
- Official Implementation of our WACV2023 paper: “Holistic Interaction Transformer Network for Action Detection”☆66Updated 3 months ago
- This repository is a fork of https://github.com/joslefaure/HIT customized for the AVA dataset☆16Updated last year
- Using open-source LLM Llama2 by Meta on local CPU inference for document question-and-answer☆15Updated last year
- ☆113Updated last year
- This code implements a versatile image search engine leveraging the CLIP model and FAISS, capable of processing both text-to-image and i…☆43Updated last year