This project demonstrates how to use the Qwen2-VL model from Hugging Face for Optical Character Recognition (OCR) and Visual Question Answering (VQA). The model combines vision and language capabilities, enabling users to analyze images and generate context-based responses.
☆26Oct 18, 2024Updated last year
Alternatives and similar repositories for Qwen2-VL-OCR-VQA
Users that are interested in Qwen2-VL-OCR-VQA are comparing it to the libraries listed below
Sorting:
- 生成训练文本检测数据集☆12Jul 1, 2020Updated 5 years ago
- A simple Streamlit frontend for a pre-trained MobileNet CNN model + OpenCV for face mask detection in images.☆10Mar 25, 2023Updated 2 years ago
- A template for a Djinni library that can be used in Java/Kotlin, ObjC/Swift and C#☆11Oct 6, 2022Updated 3 years ago
- Continuous quality evaluation of ML algorithms via CI/CD and GitHub Actions.☆16Jan 15, 2020Updated 6 years ago
- ddc ci utility for linux which live in you tray. Brightnress, sound and input.☆31Jul 3, 2025Updated 8 months ago
- A Fast Image Converter thats supports common image formats. It's using WebAssembly for all conversions so no image is sent to the server…☆11Jul 10, 2025Updated 7 months ago
- Amlogic G12A Mali support for Mali Bifrost based SoCs, for Mainline Linux only☆11Jan 28, 2023Updated 3 years ago
- Aggregation framework for annotating datasets in computer vision tasks (detection, segmentation, video captioning etc.)☆11Nov 6, 2024Updated last year
- The action uses a GitHub repository as a cache for a .conan directory to speed up very slow builds.☆13Nov 2, 2021Updated 4 years ago
- Fast-Forward Video Based on Semantic Extraction @ 2016 IEEE International Conference on Image Processing (ICIP)☆11Oct 28, 2019Updated 6 years ago
- VINS: Visual Search for Mobile User Interface Design☆49Jan 9, 2021Updated 5 years ago
- Port of gst-editor to Gtk+ 3 and GStreamer 1.0☆13Dec 16, 2016Updated 9 years ago
- ☆12Feb 17, 2026Updated 2 weeks ago
- ☆12Feb 23, 2023Updated 3 years ago
- Detect Credit card number using Mask RCNN and make task easier for OCR to retrive number from the card☆11Oct 8, 2019Updated 6 years ago
- Multimodal object tracking and scene analytics for highly actionable, real-world contextualized data☆36Updated this week
- ☆12May 22, 2023Updated 2 years ago
- transformer-based ocr model☆14Jul 27, 2022Updated 3 years ago
- A course on Hugging Face land☆24Oct 9, 2025Updated 5 months ago
- 🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.☆12Jan 17, 2021Updated 5 years ago
- Python and JS tools to generate Printed LaTex formulas and images☆16Oct 26, 2023Updated 2 years ago
- ☆11Mar 24, 2023Updated 2 years ago
- Generation of handwritten cyrillic text using fonts☆12Mar 27, 2023Updated 2 years ago
- edge/mobile transformer based Vision DNN inference benchmark☆16Aug 29, 2025Updated 6 months ago
- LaTeXDataHub is an open-source platform dedicated to the sharing and contribution of real-world LaTeX image datasets and their annotation…☆12Aug 13, 2024Updated last year
- ☆13Apr 9, 2019Updated 6 years ago
- ♻️ iOS application that connects users to local e-waste recycling locations in real time. Featured by Metro News, CANews Ottawa, and Otta…☆15Jan 18, 2018Updated 8 years ago
- Code for the Human-related Object Detection based on Natural Language Parsing of Image Query Expressions article☆13Aug 8, 2017Updated 8 years ago
- ☆10Aug 15, 2023Updated 2 years ago
- The home of Stambecco 🦌: Italian Instruction-following LLaMA Model☆19Apr 16, 2023Updated 2 years ago
- Searching the location of a template or a target image.☆12Jul 23, 2019Updated 6 years ago
- Stamp and signs detection on images using OpenCV and clustering☆13Sep 19, 2019Updated 6 years ago
- Multi-platform, single executable HTTP proxy connecting through SSH tunnels☆10Jul 2, 2016Updated 9 years ago
- [ACL 2025] RealHiTBench: A Comprehensive Realistic Hierarchical Table Benchmark for Evaluating LLM-Based Table Analysis☆24Aug 8, 2025Updated 7 months ago
- Simple REST Yandex.Disk Client☆16Jan 5, 2020Updated 6 years ago
- A very simple tool to rewrite parameters such as attributes and constants for OPs in ONNX models. Simple Attribute and Constant Modifier …☆15Feb 6, 2026Updated last month
- OCR Web app with `easyocr` and `streamlit`☆17Sep 8, 2021Updated 4 years ago
- An open-source tool created by OctoML that converts TVM-optimized models to code runnable in ONNX Runtime.☆17Mar 30, 2023Updated 2 years ago
- Deep learning classifier and image generator for building architecture.☆12Dec 14, 2018Updated 7 years ago