☆58Jul 1, 2024Updated last year
Alternatives and similar repositories for AICITY2024_Track2_AliOpenTrek_CityLLaVA
Users that are interested in AICITY2024_Track2_AliOpenTrek_CityLLaVA are comparing it to the libraries listed below
- AICITY2024 Track 2 - Code from AIO_ISC Team☆37Jul 13, 2024Updated last year
- ☆52Jun 16, 2025Updated 8 months ago
- ☆16Mar 26, 2025Updated 11 months ago
- ☆13Dec 6, 2024Updated last year
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model☆22Aug 5, 2024Updated last year
- Object detection and classification☆12Oct 19, 2018Updated 7 years ago
- ☆20Updated this week
- ☆42Sep 15, 2025Updated 5 months ago
- ☆36Feb 8, 2026Updated last month
- ☆14Oct 15, 2024Updated last year
- A summary of theory and practice learning materials for deep learning beginners☆13Apr 19, 2019Updated 6 years ago
- TransSimHub is a lightweight Python library for simulating and controlling transportation systems.☆52Feb 13, 2026Updated 3 weeks ago
- Official Repo for ICCV25-Video2BEV: Transforming Drone Videos to BEVs for Video-based Geo-localization☆27Feb 4, 2026Updated last month
- Pytorch implementation of the paper 'Towards Scenario Generalization for Vision-based Roadside 3D Object Detection'☆17Mar 9, 2025Updated last year
- Official pytorch implementation of the ICML2024 main conference paper: Pedestrian Attribute Recognition as Label-balanced Multi-label Lea…☆13Jul 22, 2024Updated last year
- ☆41Jan 4, 2026Updated 2 months ago
- [CVPRW 2024] TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning. Official code for the 3rd place solution of t…☆52Feb 11, 2025Updated last year
- ☆50Jun 30, 2024Updated last year
- This is a project about Optical Character Recognition (OCR) in Vietnamese texts by using PaddleOCR and VietOCR.☆27Mar 19, 2024Updated last year
- Simulated Chinese License Plate Character images☆18Jan 22, 2021Updated 5 years ago
- An Efficient and Realistic Traffic Simulator with Embedded Machine Learning Models☆23Dec 19, 2024Updated last year
- rmp data ranking☆13Nov 4, 2025Updated 4 months ago
- [CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution☆59Mar 4, 2025Updated last year
- Code and models of paper "Chained Multi-stream Networks Exploiting Pose, Motion, and Appearance for Action Classification and Detection"…☆27Aug 10, 2018Updated 7 years ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆69May 31, 2024Updated last year
- A Text2SQL benchmark for evaluation of Large Language Models☆41Updated this week
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- ☆124Jul 29, 2024Updated last year
- ☆68Dec 7, 2025Updated 3 months ago
- An experiment to see if we can process G2 reviews to extract topics from reviews☆10Feb 5, 2024Updated 2 years ago
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios☆16Oct 18, 2024Updated last year
- [ICCVW2025] V-RoAst: A New Dataset for Visual Road Assessment☆11Dec 17, 2025Updated 2 months ago
- ☆18Jun 10, 2025Updated 8 months ago
- xKV: Cross-Layer SVD for KV-Cache Compression☆45Nov 30, 2025Updated 3 months ago
- ☆36Jul 1, 2024Updated last year
- A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team.☆74Oct 14, 2024Updated last year
- This repository contains the code for the paper "Traffic Camera Calibration via Vehicle Vanishing Point Detection" (ICANN 2021)☆31Jan 23, 2024Updated 2 years ago
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202…☆40May 26, 2025Updated 9 months ago
- [ICLR2025] Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want☆95Dec 1, 2025Updated 3 months ago