visresearch / MultipleObjectStitching
The official code of "Multiple Object Stitching for Unsupervised Representation Learning"
☆17Updated 7 months ago
Alternatives and similar repositories for MultipleObjectStitching
Users who are interested in MultipleObjectStitching are comparing it to the libraries listed below.
- Official PyTorch Implementation of Self-emerging Token Labeling☆35Updated last year
- ☆17Updated 6 months ago
- ☆73Updated 6 months ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…☆12Updated 2 years ago
- [ICLR'26] Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs☆96Updated last week
- ☆30Updated 2 weeks ago
- SAM-CLIP module for use with Autodistill.☆17Updated 2 years ago
- REACT (CVPR 2023, Highlight 2.5%)☆142Updated 2 years ago
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models.☆19Updated last year
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models" ICLR 2024☆111Updated last year
- EdgeSAM model for use with Autodistill.☆29Updated last year
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions.☆20Updated last year
- [NeurIPS 2024] Empirical Lessons Toward Memory-Efficient and Fast Diffusion Models for Text-to-Image Synthesis☆146Updated 3 weeks ago
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with …☆63Updated last year
- ViT trained on COYO-Labeled-300M dataset☆33Updated 3 years ago
- ☆87Updated 2 years ago
- [CVPR 2025] DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding☆25Updated last month
- [NeurIPS 2022] code for "K-LITE: Learning Transferable Visual Models with External Knowledge" https://arxiv.org/abs/2204.09222☆53Updated 2 years ago
- A UI designer for constructing AI applications with OpenSearch☆16Updated last week
- ☆18Updated 2 years ago
- research work on multimodal cognitive ai☆68Updated last month
- VimTS: A Unified Video and Image Text Spotter☆79Updated last year
- Official PyTorch implementation of `[ACMMM 2023] Relational Contrastive Learning for Scene Text Recognition`☆17Updated 2 years ago
- Implementation of "the first large-scale multimodal mixture of experts models" from the paper: "Multimodal Contrastive Learning with…☆36Updated last week
- Data Programming for Text Detection in Documents using SPEAR☆12Updated 10 months ago
- [AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues☆60Updated 9 months ago
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model…☆37Updated 2 years ago
- Official repository for K-EXAONE built by LG AI Research☆66Updated last week
- ☆17Updated 7 months ago
- Vision-oriented multimodal AI☆51Updated last year