xiteng01 / CVPR2023_foundation_model_Track1
Workshop on Foundation Model 1st foundation model challenge Track1 codebase (Open TransMind v1.0)
☆18Updated 2 years ago
Alternatives and similar repositories for CVPR2023_foundation_model_Track1
Users who are interested in CVPR2023_foundation_model_Track1 are comparing it to the repositories listed below.
- ☆26Updated 10 months ago
- Large Multimodal Model☆15Updated last year
- A roundup of news on domestic and international data competitions☆18Updated 3 years ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.☆40Updated 9 months ago
- ☆40Updated last year
- This is code of paper "ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer"☆26Updated last year
- Zone Evaluation: Revealing Spatial Bias in Object Detection (TPAMI 2024)☆46Updated 6 months ago
- Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning☆17Updated 4 months ago
- 1st solution for the Webly-supervised Fine-grained Recognition competition in https://www.cvmart.net/race/10412/base☆35Updated 2 years ago
- ChineseCLIP using online learning☆13Updated 2 years ago
- convert paddleOCR to torchOCR, ppocr-v3,ppocr-v4, onnx, openvino☆32Updated last year
- [ICME 2023] FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation☆11Updated 2 years ago
- Image quality assessment☆21Updated 5 years ago
- NMS and Soft-NMS implementations in PyTorch and NumPy☆12Updated 4 years ago
- ☆57Updated last year
- A multimodal image search engine built on the GME model, capable of handling diverse input types. Whether you're querying with text, imag…☆40Updated 2 weeks ago
- Searching a High Performance Feature Extractor for Text Recognition Network. TPAMI 2022☆13Updated 2 years ago
- This repo holds the competitions (information, solutions, summaries, memories) that our team has participated in☆26Updated last year
- Building a VLM model starts from the basic module.☆16Updated last year
- General Image Classification Code base☆21Updated 3 years ago
- Official implementation of paper "Masked Distillation with Receptive Tokens", ICLR 2023.☆69Updated 2 years ago
- DATE: Dual Assignment for End-to-End Fully Convolutional Object Detection☆42Updated 2 years ago
- ☆29Updated 3 years ago
- ☆16Updated 3 years ago
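- A Qwen-VL model that can be successfully fine-tuned with LoRA☆18Updated last year
- This project uses the LLaVA 1.6 multimodal model to implement text-to-image and image-to-image search.☆23Updated last year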
- Test different pooling method used in CNN for Computer Vision Task☆35Updated 4 years ago
- Official implementation of TagAlign☆35Updated 6 months ago
- ☆18Updated 2 years ago
- C++ and CUDA extensions for Python/Pytorch and GPU Accelerated Augmentation.☆35Updated 2 years ago