vivien000 / clip-demo
Minimal user-friendly demo of OpenAI's CLIP for semantic image search
☆14 Updated 4 months ago
Alternatives and similar repositories for clip-demo:
Users who are interested in clip-demo are comparing it to the repositories listed below.
- ☆89 Updated last year
- An official codebase for the paper "CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos (ICCV 23)" ☆52 Updated last year
- ☆32 Updated 2 years ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆34 Updated 2 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆76 Updated 4 months ago
- LoRA fine-tuned Stable Diffusion Deployment ☆31 Updated 2 years ago
- ☆31 Updated 10 months ago
- ☆15 Updated 2 years ago
- We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datas… ☆77 Updated 2 years ago
- ☆57 Updated 7 months ago
- [COLM 2024] Early Weight Averaging meets High Learning Rates for LLM Pre-training ☆15 Updated 4 months ago
- Repository for Multilingual-VQA task created during HuggingFace JAX/Flax community week. ☆34 Updated 3 years ago
- ☆17 Updated 9 months ago
- ☆64 Updated last year
- Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text ☆24 Updated 2 years ago
- ☆44 Updated 3 years ago
- Implementation for the CVPR 2023 paper "Improving Selective Visual Question Answering by Learning from Your Peers" (https://arxiv.org/abs… ☆24 Updated last year
- Using pretrained encoder and language models to generate captions from multimedia inputs. ☆94 Updated last year
- ☆13 Updated last year
- Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦 ☆62 Updated last year
- ☆27 Updated 3 weeks ago
- The collection of building blocks for building fine-tunable metric learning models ☆32 Updated last month
- Application for searching images from natural language queries ☆46 Updated 3 years ago
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets. ☆63 Updated 5 months ago
- Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Lo… ☆39 Updated last year
- ☆68 Updated 8 months ago
- ☆45 Updated 4 months ago
- 🎨 Imagine what Picasso could have done with AI. Self-host your StableDiffusion API. ☆50 Updated last year
- This project breathes life into video characters by using AI to describe their personality and then chat with you as them. ☆45 Updated 11 months ago
- Example codebase for fine-tuning LayoutLMv3 on DocVQA ☆50 Updated 2 years ago