akaihaoshuai / crawler_paper
Crawls paper lists from ICCV and similar web pages, and fetches the related information from arXiv
☆14Updated last year
Alternatives and similar repositories for crawler_paper:
Users who are interested in crawler_paper are comparing it to the libraries listed below
- GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)☆65Updated last year
- Open-vocabulary Semantic Segmentation☆34Updated last year
- Parameter-Efficient Fine-Tuning for Foundation Models☆57Updated 3 weeks ago
- Code for Retrieval-Augmented Perception (RAP)☆10Updated last month
- Codebase of ACL 2023 Findings "Aerial Vision-and-Dialog Navigation"☆50Updated 5 months ago
- A collection of strong multimodal models for building multimodal AGI agents☆41Updated 9 months ago
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration☆30Updated 3 months ago
- ☆56Updated last year
- This project uses the LLaVA 1.6 multimodal model to implement text-to-image and image-to-image search.☆21Updated last year
- LAVIS - A One-stop Library for Language-Vision Intelligence☆47Updated 8 months ago
- ☆47Updated last month
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …☆52Updated 5 months ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆53Updated last year
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training …☆39Updated 6 months ago
- [ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding☆56Updated 6 months ago
- [SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…☆49Updated 5 months ago
- Vision-oriented multimodal AI☆49Updated 10 months ago
- Various things shared by breezedeus☆22Updated 2 years ago
- (ACM MM24) This is the official repository of GIST: Improving Parameter Efficient Fine Tuning via Knowledge Interaction.☆10Updated last year
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023☆14Updated 4 months ago
- ☆23Updated 8 months ago
- [IGARSS 2025] A Simple Aerial Detection Baseline of Multimodal Language Models.☆66Updated last week
- ☆54Updated last month
- ☆19Updated 6 months ago
- ☆30Updated last year
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models☆37Updated last year
- ☆40Updated last month
- ☆35Updated last year
- [NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model☆86Updated last year
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"☆21Updated 3 weeks ago