☆90May 20, 2025Updated 10 months ago
Alternatives and similar repositories for textvqa_grounding_task_qwen2.5-vl-ft
Users that are interested in textvqa_grounding_task_qwen2.5-vl-ft are comparing it to the libraries listed below.
- ☆13Jan 3, 2024Updated 2 years ago
- Comprehensive benchmark for video text understanding☆28Jun 4, 2025Updated 10 months ago
- [NeurIPS 24 Spotlight] Training in Pairs + Inference on Single Image with Anchors☆49Feb 20, 2025Updated last year
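- Deploys YOLO11 table detection with OpenCV. This is the 2nd-place solution from the Baidu Netdisk AI Competition table detection track, and it contains three modules: table box detection, table corner detection, and table orientation classification. As before, I wrote both C++ and Python versions of the program☆13Dec 12, 2024Updated last year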
- Multiple-Person Multi-Camera Tracker☆13Feb 17, 2017Updated 9 years ago
- Second place in the DanceTrack competition☆13Jan 29, 2023Updated 3 years ago
- Towards Training-free Open-world Segmentation via Image Prompt Foundation Models☆18Nov 22, 2024Updated last year
- Add YOLOv3_tiny and data augment(clip, brighten, change saturation)☆14Jan 14, 2021Updated 5 years ago
- [ACCV 2024 (Oral, Best Application Paper)] Official Implementation of NT-VOT211: A Large-Scale Benchmark for Night-time Visual Object Tra…☆15Dec 30, 2025Updated 3 months ago
- ☆24Mar 6, 2026Updated last month
- EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning [🔥The Exploration of R1 for General Audio-Vi…☆78May 18, 2025Updated 11 months ago
- shopee integration for n8n☆15Sep 20, 2024Updated last year
- This repo contains implementation of deep learning-based steel surface defect segmentation models. Extensive experiments on several deep …☆22May 19, 2025Updated 11 months ago
- (CVPR 2025 Highlight) Official repository of paper "AODRaw: Towards RAW Object Detection in Diverse Conditions" (https://arxiv.org/pdf/24…☆24Apr 6, 2025Updated last year
- [ECCV2024] ModTr: Modality Translation for Object Detection Adaptation Without Forgetting Prior Knowledge☆19Nov 28, 2024Updated last year
- Conditional EEG diffusion model☆17Apr 5, 2024Updated 2 years ago
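- Image preprocessing toolkit for hyperspectral image computer-vision classification, including removal of irrelevant image backgrounds, data augmentation, label file generation, and more☆18Nov 4, 2023Updated 2 years ago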
- The objective of this project is to demonstrate how to fine-tune deepseek-janus-pro-lora.☆38Jun 8, 2025Updated 10 months ago
- This is a simple toolkit to view and crop image patches for image/video super-resolution tasks.☆11Jan 6, 2023Updated 3 years ago
- unofficial implementation of https://arxiv.org/pdf/2301.08871v1.pdf on pytorch☆15Apr 20, 2023Updated 2 years ago
- (NeurIPS 2024) BiDM: Pushing the Limit of Quantization for Diffusion Models☆22Nov 20, 2024Updated last year
- Co-DETR (Detection Transformer) compiled from PyTorch to NVIDIA TensorRT☆20Apr 19, 2025Updated last year
- The Codes and Data of A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection [ICLR'25]☆246Jan 14, 2026Updated 3 months ago
- [CVPR 2026] VGGDrive: Empowering Vision-Language Models with Cross-View Geometric Grounding for Autonomous Driving☆85Mar 10, 2026Updated last month
- Implementation of popular deep learning networks with TensorRT network definition APIs☆10Mar 25, 2021Updated 5 years ago
- Official implementation of Instance-wise and Center-of-Instance (ICI) segmentation loss☆12Oct 6, 2023Updated 2 years ago
- ☆22Jun 19, 2024Updated last year
- [ECCV 2024] Official Implementation of "Disentangling Masked Autoencoders for Unsupervised Domain Generalization"☆14Jul 31, 2024Updated last year
- A Framework for Symbolic MUsic Graph Explanations☆10Jul 30, 2025Updated 8 months ago
- Real-time recognition of video streams using qwen2.5_vl deployed with vllm☆20Apr 1, 2025Updated last year
- Project for SNARE benchmark☆11Jun 5, 2024Updated last year
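- The general-purpose digital human system is an intelligent interaction platform based on deep learning and WebRTC, integrating Azure Avatar digital human rendering, speech recognition and synthesis, natural language processing, and other technologies. The system supports real-time dialogue, knowledge Q&A, and emotional interaction, with smooth rendering above 30 FPS and low-latency responses within 200 ms. Core features include GPT-based intelligent dialogue, …☆31Dec 17, 2025Updated 4 months ago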
- ☆12Jun 1, 2024Updated last year
- Agentic Keyframe Search for Video Question Answering☆18Apr 7, 2025Updated last year
- PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model☆28Oct 10, 2024Updated last year
- The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)☆32Mar 29, 2024Updated 2 years ago
- ☆11Nov 3, 2021Updated 4 years ago
- Official repository of "Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach" (ACL 2024 Oral)☆35Mar 24, 2025Updated last year
- A Large-Scale Blind Image Quality Assessment Database☆17Jul 18, 2023Updated 2 years ago