GuyARoss / CLIP-video-search
Demo of a natural-language video database using CLIP
☆27Updated last year
Alternatives and similar repositories for CLIP-video-search
Users who are interested in CLIP-video-search compare it to the libraries listed below.
- Chinese CLIP models with SOTA performance.☆59Updated 2 years ago
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆28Updated last year
- ☆72Updated 2 years ago
- Code for the Video Similarity Challenge.☆80Updated last year
- Easiest way of fine-tuning HuggingFace video classification models☆145Updated 2 years ago
- ☆18Updated 2 years ago
- Toward Universal Multimodal Embedding☆64Updated 2 months ago
- menovideo: pytorch library for video action recognition and video understanding☆29Updated 4 years ago
- Code and model for the AI City Challenge (CVPR 2022) Track 3 Action Detection (Naturalistic Driving Action Recognition)☆28Updated 2 years ago
- Our 2nd-gen LMM☆34Updated last year
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM☆101Updated last year
- ☆29Updated 3 years ago
- Research Code for Multimodal-Cognition Team in Ant Group☆168Updated 2 weeks ago
- Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi…☆51Updated last year
- TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision [AAAI2023 Oral]☆56Updated 2 years ago
- Low-latency ONNX and TensorRT based zero-shot classification and detection with contrastive language-image pre-training based prompts☆43Updated last year
- Codebase for the Recognize Anything Model (RAM)☆85Updated last year
- Facebook Image Similarity Challenge 2021☆19Updated 3 years ago
- Use CLIP to represent video for Retrieval Task☆70Updated 4 years ago
- [CVPR 2023 Workshop] Code to reproduce the results of our solutions on both tracks of the Meta AI Video Similarity Challenge (CVPR 2023 Wor…☆53Updated 2 years ago
- [CVPR Challenge Rank 2nd] The code and related files to reproduce the results for the Video Similarity Challenge Descriptor Track.☆19Updated 6 months ago
- Vision-oriented multimodal AI☆49Updated last year
- ☆28Updated 4 years ago
- A simple MLLM that surpasses QwenVL-Max using only open-source data with a 14B LLM.☆38Updated last year
- [ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model☆43Updated 10 months ago
- This repository contains the dataset, codebase, and benchmarks for our paper: <CNVid-3.5M: Build, Filter, and Pre-train the Large-scale P…☆25Updated last year
- A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"☆14Updated 4 years ago
- 1st Place Solution in Google Universal Image Embedding☆67Updated 2 years ago
- Using open-source LLM Llama2 by Meta on local CPU inference for document question-and-answer☆15Updated 2 years ago
- Exploration of Adept's multimodal fuyu-8b model. 🤓 🔍☆27Updated last year