xiteng01 / CVPR2023_foundation_model_Track1
Workshop on Foundation Model 1st foundation model challenge Track1 codebase (Open TransMind v1.0)
☆18Updated 2 years ago
Alternatives and similar repositories for CVPR2023_foundation_model_Track1
Users that are interested in CVPR2023_foundation_model_Track1 are comparing it to the libraries listed below
- A roundup of domestic and international data competition news☆18Updated 3 years ago
- ☆25Updated 9 months ago
- Large Multimodal Model☆15Updated last year
- Zone Evaluation: Revealing Spatial Bias in Object Detection (TPAMI 2024)☆44Updated 6 months ago
- Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning☆16Updated 3 months ago
- A subset of YFCC100M. Tools, checking scripts and links of web drive to download datasets(uncompressed).☆19Updated 6 months ago
- Searching a High Performance Feature Extractor for Text Recognition Network. TPAMI 2022☆13Updated 2 years ago
- This is code of paper "ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer"☆26Updated last year
- Adds some files missing from Visualglm so that Visualglm can be trained; the example performs physiognomy (face reading) recognition on human faces☆13Updated last year
- [ICME 2023] FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation☆11Updated 2 years ago
- Barcode detection with DBNet, with programs in both C++ and Python☆35Updated 4 years ago
- convert paddleOCR to torchOCR, ppocr-v3,ppocr-v4, onnx, openvino☆32Updated last year
- ☆28Updated last year
- Knowledge Distillation Toolbox for Semantic Segmentation☆17Updated 2 years ago
- ☆18Updated 2 years ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.☆40Updated 8 months ago
- ☆40Updated last year
- ☆56Updated last year
- This repo holds the competitions (information, solutions, summaries, memories) that our team has participated in☆26Updated last year
- DATE: Dual Assignment for End-to-End Fully Convolutional Object Detection☆41Updated last year
- ☆18Updated 2 years ago
- General Image Classification Code base☆21Updated 3 years ago
- Official implementation of paper "Masked Distillation with Receptive Tokens", ICLR 2023.☆68Updated 2 years ago
- ChineseCLIP using online learning☆13Updated 2 years ago
- Building a VLM model starts from the basic module.☆16Updated last year
- [ECCV 2020 Workshop] VIPriors Object Detection Champion☆44Updated last year
- A multimodal image search engine built on the GME model, capable of handling diverse input types. Whether you're querying with text, imag…☆36Updated 5 months ago
- ☆22Updated 2 years ago
- An implementation of MSSRM method☆11Updated 2 years ago
- Unified Architecture Search with Convolution, Transformer, and MLP (ECCV 2022)☆54Updated 2 years ago