This project demonstrates how to use the Qwen2-VL model from Hugging Face for Optical Character Recognition (OCR) and Visual Question Answering (VQA). The model combines vision and language capabilities, enabling users to analyze images and generate context-based responses.
☆27Oct 18, 2024Updated last year
Alternatives and similar repositories for Qwen2-VL-OCR-VQA
Users that are interested in Qwen2-VL-OCR-VQA are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Repository for the KVP10k dataset☆22Sep 18, 2025Updated 7 months ago
- Face anti-spoofing model, python/pytorch☆16Dec 19, 2023Updated 2 years ago
- The Safari browser does not adjust the view layout size when activating the virtual keyboard on mobile phones. You can see the difference…☆19Sep 28, 2023Updated 2 years ago
- object tracker for VOT☆10Jun 22, 2016Updated 9 years ago
- ☆10Dec 9, 2018Updated 7 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- Classification of Day-Night images by Image Processing using OpenCV☆18May 6, 2024Updated last year
- tmp DPI☆14Dec 18, 2024Updated last year
- A template for a Djinni library that can be used in Java/Kotlin, ObjC/Swift and C#☆11Oct 6, 2022Updated 3 years ago
- Web service for image file/image URL classification without uploading.☆16May 27, 2022Updated 3 years ago
- Code for the Human-related Object Detection based on Natural Language Parsing of Image Query Expressions article☆13Aug 8, 2017Updated 8 years ago
- Optimizing Monocular Depth Estimation with TensorRT: Model Conversion, Inference Acceleration, and 3D Reconstruction☆44Mar 9, 2026Updated last month
- 🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.☆12Jan 17, 2021Updated 5 years ago
- 生成训练文本检测数据集☆12Jul 1, 2020Updated 5 years ago
- ☆33May 18, 2023Updated 2 years ago
- Serverless GPU API endpoints on Runpod - Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- Examples of cleaning up raw voices☆18Mar 2, 2022Updated 4 years ago
- Continuous quality evaluation of ML algorithms via CI/CD and GitHub Actions.☆16Jan 15, 2020Updated 6 years ago
- ☆14Aug 10, 2019Updated 6 years ago
- Teaching a Convolutional Neural Network to recognize painting genre. Handcrafted dataset. Cool visualizations.☆10Dec 19, 2018Updated 7 years ago
- ☆13Apr 9, 2019Updated 7 years ago
- The action uses a GitHub repository as a cache for a .conan directory to speed up very slow builds.☆13Nov 2, 2021Updated 4 years ago
- ☆57Nov 17, 2017Updated 8 years ago
- ☆10Jun 11, 2025Updated 10 months ago
- Port of gst-editor to Gtk+ 3 and GStreamer 1.0☆13Dec 16, 2016Updated 9 years ago
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- A simple Streamlit frontend for a pre-trained MobileNet CNN model + OpenCV for face mask detection in images.☆10Mar 25, 2023Updated 3 years ago
- Amlogic G12A Mali support for Mali Bifrost based SoCs, for Mainline Linux only☆11Jan 28, 2023Updated 3 years ago
- 🚀 Easy to use time and date picker with lots of options for React Native 🥳☆36Jul 10, 2023Updated 2 years ago
- A very simple tool to rewrite parameters such as attributes and constants for OPs in ONNX models. Simple Attribute and Constant Modifier …☆15Feb 6, 2026Updated 2 months ago
- Learning to Prune: Exploring the Frontier of Fast and Accurate Parsing☆22Sep 24, 2024Updated last year
- Build TVM docker image for production compilation deployments☆12Sep 7, 2021Updated 4 years ago
- ddc ci utility for linux which live in you tray. Brightnress, sound and input.☆31Mar 25, 2026Updated 3 weeks ago
- Multi-platform, single executable HTTP proxy connecting through SSH tunnels☆10Jul 2, 2016Updated 9 years ago
- generative models for speech☆20Jul 4, 2016Updated 9 years ago
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- YOLOXとByteTrackを用いたMOT(Multiple Object Tracking)のPythonサンプル☆23Feb 1, 2023Updated 3 years ago
- LaTeXDataHub is an open-source platform dedicated to the sharing and contribution of real-world LaTeX image datasets and their annotation…☆12Aug 13, 2024Updated last year
- ☆17Jul 30, 2024Updated last year
- Detect Credit card number using Mask RCNN and make task easier for OCR to retrive number from the card☆11Oct 8, 2019Updated 6 years ago
- Little image retouching application for Linux Desktop (Development)☆14Feb 22, 2022Updated 4 years ago
- Easy to download and parse version of the Smartdoc 2015 - Challenge 1 dataset.☆15Mar 5, 2018Updated 8 years ago
- ☆13May 22, 2023Updated 2 years ago