NTUYWANG103 / clip-image-search
This code implements a versatile image search engine leveraging the CLIP model and FAISS, capable of processing both text-to-image and image-to-image queries.
☆48 Updated last year
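The description above mentions both text-to-image and image-to-image queries. Both can share one lookup once text and images are embedded into the same vector space. The following is a minimal sketch of that idea, not the repository's actual code: the three-dimensional vectors are made-up stand-ins for CLIP embeddings, and a plain-Python brute-force inner-product search stands in for a FAISS index. The names `normalize`, `search`, `raw`, and `index` are hypothetical.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so the inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def search(index, query, k=2):
    """Return the k (image_id, score) pairs most similar to the query embedding."""
    q = normalize(query)
    scored = [
        (image_id, sum(a * b for a, b in zip(q, emb)))
        for image_id, emb in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Assumption: toy 3-dimensional embeddings; a real CLIP model outputs far larger vectors.
raw = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
index = {image_id: normalize(vec) for image_id, vec in raw.items()}

# Text-to-image: the query is a (pretend) text embedding.
text_hits = search(index, [0.8, 0.2, 0.0])

# Image-to-image: the query is the embedding of an existing image.
image_hits = search(index, raw["dog.jpg"])

print(text_hits[0][0], image_hits[0][0])
```

Because the query type only changes which encoder produces the vector, the search step itself is identical for both modes. A dedicated vector index becomes worthwhile when the collection is too large for a linear scan like this one.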
Alternatives and similar repositories for clip-image-search
Users who are interested in clip-image-search are comparing it to the repositories listed below.
- A simple image search engine using CLIP feature. ☆73 Updated 2 years ago
- Chinese CLIP models with SOTA performance. ☆59 Updated 2 years ago
- Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi… ☆51 Updated last year
- ☆72 Updated 2 years ago
- Codebase for the Recognize Anything Model (RAM) ☆87 Updated last year
- Chinese Stable Diffusion, zh SD, Chinese text-to-image, Chinese SD, Chinese Stable Diffusion ☆49 Updated last year
- Image Editing Anything ☆116 Updated 2 years ago
- ☆183 Updated 3 months ago
- MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training) ☆141 Updated 9 months ago
- Precision Search through Multi-Style Inputs ☆72 Updated 3 months ago
- [IJCV'24] AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort ☆151 Updated 11 months ago
- Grounding DINO with Segment Anything & Stable Diffusion colab ☆194 Updated last year
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM ☆101 Updated last year
- ☆196 Updated last year
- ☆57 Updated last year
- A cli program of image retrieval using dinov2 ☆78 Updated 2 years ago
- Our 2nd-gen LMM ☆34 Updated last year
- ☆79 Updated last year
- Generate image from anything with ImageBind and Stable Diffusion ☆198 Updated 2 years ago
- The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation". ☆253 Updated last year
- Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models ☆313 Updated last year
- [TIP 2025] CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models 🔥 ☆219 Updated 6 months ago
- [ICCV2023] TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance ☆114 Updated last year
- Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation ☆142 Updated last year
- [ECCV 2022] AutoTransition: Learning to Recommend Video Transition Effects ☆65 Updated 8 months ago
- ☆21 Updated 2 years ago
- Research Code for Multimodal-Cognition Team in Ant Group ☆169 Updated 3 weeks ago
- [ICCV2023] Segment Every Reference Object in Spatial and Temporal Spaces ☆236 Updated 8 months ago
- AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection - CVPR NAS 2023 ☆197 Updated 2 years ago
- [ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model ☆340 Updated last year