PCIResearch / TransCore-M
Large Multimodal Model
☆15 Updated last year
Alternatives and similar repositories for TransCore-M
Users who are interested in TransCore-M are comparing it to the libraries listed below.
- Lion: Kindling Vision Intelligence within Large Language Models ☆52 Updated last year
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM ☆46 Updated last year
- A subset of YFCC100M. Tools, checking scripts, and web-drive links for downloading the datasets (uncompressed). ☆19 Updated 8 months ago
- ChineseCLIP using online learning ☆13 Updated 2 years ago
- [ACM MM2025] The official repository for the RealSyn dataset ☆35 Updated last week
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types ☆24 Updated this week
- Official code for "Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval" ☆24 Updated last week
- Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed ☆94 Updated 8 months ago
- Large-batch Optimization for Dense Visual Predictions (NeurIPS 2022) ☆57 Updated 2 years ago
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models ☆62 Updated 8 months ago
- Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning ☆88 Updated 2 weeks ago
- The proposed simulated dataset consisting of 9,536 charts and associated data annotations in CSV format. ☆25 Updated last year
- A huge dataset for Document Visual Question Answering ☆19 Updated 11 months ago
- Chinese CLIP models with SOTA performance. ☆55 Updated last year
- OpenMMLab Detection Toolbox and Benchmark for V3Det ☆15 Updated last year
- A multimodal large model implemented from scratch and named Reyes (睿视; R: 睿, eyes: 眼). Reyes has 8B parameters; its vision encoder is InternViT-300M-448px-V2_5, its language model is Qwen2.5-7B-Instruct, and Reyes also uses a two-layer MLP projection layer to connect… ☆22 Updated 5 months ago
- [ACM MM 2021, TMM 2023] Disentangle your Dense Object Detector ☆61 Updated 3 years ago
- A Simple Framework of Small-scale LMMs for Video Understanding ☆71 Updated last month
- Official implementation of paper "Masked Distillation with Receptive Tokens", ICLR 2023. ☆69 Updated 2 years ago
- [CVPR2023] This is an official implementation of paper "DETRs with Hybrid Matching". ☆14 Updated 2 years ago
- Research Code for Multimodal-Cognition Team in Ant Group ☆154 Updated last week
- [IJCV 2024] TransDETR: End-to-end Video Text Spotting with Transformer ☆104 Updated last year
- Pruning the VLLMs ☆97 Updated 7 months ago