ChenAnno / Real20M_ACMMM2023
Official implementation for "Real20M: A Large-scale E-commerce Dataset for Cross-domain Retrieval"
☆25 · Oct 27, 2025 · Updated 3 months ago
Alternatives and similar repositories for Real20M_ACMMM2023
Users who are interested in Real20M_ACMMM2023 are comparing it to the libraries listed below.
- Official implementation for "FashionERN: Enhance-and-Refine Network for Composed Fashion Image Retrieval" ☆19 · Oct 27, 2025 · Updated 3 months ago
- Lion: Kindling Vision Intelligence within Large Language Models ☆51 · Jan 25, 2024 · Updated 2 years ago
- ☆17 · Mar 5, 2025 · Updated 11 months ago
- A digital twin of the city of Chicago along with automated sensors ☆12 · Nov 14, 2019 · Updated 6 years ago
- ☆10 · Oct 25, 2024 · Updated last year
- Source code of ICML'22 paper: FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting ☆10 · Jun 10, 2022 · Updated 3 years ago
- ☆12 · Jan 10, 2025 · Updated last year
- ☆10 · Mar 25, 2022 · Updated 3 years ago
- ☆12 · Jan 8, 2021 · Updated 5 years ago
- "FORB: A Flat Object Retrieval Benchmark for Universal Image Embedding", NeurIPS 2023 Datasets and Benchmarks Track ☆12 · Jun 20, 2024 · Updated last year
- AdaCrowd: Unlabeled Scene Adaptation for Crowd Counting (TMM 2021) ☆10 · Feb 24, 2021 · Updated 4 years ago
- Transfer Knowledge Learned from Multiple Domains for Time-series Data Prediction ☆12 · Apr 20, 2018 · Updated 7 years ago
- Data-Efficient Multimodal Fusion on a Single GPU ☆68 · May 7, 2024 · Updated last year
- yolact, fcos, gluoncv ☆14 · Nov 28, 2022 · Updated 3 years ago
- Code for paper: Unified Text-to-Image Generation and Retrieval ☆16 · Jul 6, 2024 · Updated last year
- The application of large pre-trained vision model DINOv2 from MetaAI for feature points matching, and a ViT decoder used for Auto Encoder ☆17 · Apr 27, 2023 · Updated 2 years ago
- [AAAI 2025] Official code for "OmniCount: Multi-label Object Counting with Semantic-Geometric Priors" ☆21 · Sep 30, 2025 · Updated 4 months ago
- In this work, we implement different cross-modal learning schemes such as Siamese Network, Correlational Network and Deep Cross-Modal Pro… ☆11 · Aug 23, 2021 · Updated 4 years ago
- ☆64 · Feb 1, 2026 · Updated last week
- ☆22 · Sep 9, 2025 · Updated 5 months ago
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models" ☆20 · Oct 17, 2024 · Updated last year
- ☆18 · Aug 23, 2022 · Updated 3 years ago
- ☆14 · Oct 14, 2021 · Updated 4 years ago
- Implementation on pytorch of the code from the ECCV 2018 paper - Single Shot Scene Text Retrieval ☆13 · Dec 15, 2021 · Updated 4 years ago
- ☆17 · Oct 7, 2022 · Updated 3 years ago
- Mxnet2Caffe_Tensor RT ☆18 · Apr 20, 2019 · Updated 6 years ago
- The official codebase for Reflected Flow Matching (ICML 2024) ☆22 · Jun 19, 2024 · Updated last year
- MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning ☆39 · Dec 30, 2025 · Updated last month
- The official GitHub repo for the paper MOAT: Evaluating LMMs for Capability Integration and Instruction Grounding. ☆22 · Dec 29, 2025 · Updated last month
- Code release for "Understanding Bias in Large-Scale Visual Datasets" ☆22 · Dec 4, 2024 · Updated last year
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆22 · Aug 5, 2024 · Updated last year
- Real-Time and Accurate Object Detection in Compressed Video by Long Short-term Feature Aggregation ☆21 · Apr 13, 2021 · Updated 4 years ago
- Code implementation of our ICCV 2025 paper: On Large Multimodal Models as Open-World Image Classifiers ☆26 · Dec 4, 2025 · Updated 2 months ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval ☆92 · Apr 16, 2024 · Updated last year
- ☆26 · Oct 19, 2022 · Updated 3 years ago
- ENTIRe-ID ☆25 · Jul 13, 2024 · Updated last year
- [NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training ☆26 · Dec 5, 2023 · Updated 2 years ago
- Object Detection for Video with MXNet and GluonCV using YOLOv3 ☆22 · Nov 21, 2022 · Updated 3 years ago
- Implementation of "Single Shot Video Object Detector" ☆23 · Mar 25, 2020 · Updated 5 years ago