☆14Jun 10, 2025Updated 11 months ago
Alternatives and similar repositories for KIE-HVQA
Users that are interested in KIE-HVQA are comparing it to the libraries listed below.
- DatasetImgLabeler is an image annotation tool for researchers to prepare datasets in ICDAR2015 format☆12Dec 7, 2019Updated 6 years ago
- 🎓Automatically update LLM inference systems papers daily using GitHub Actions (updates every 12 hours)☆12May 18, 2026Updated last week
- Implementation of "DIME-FM: DIstilling Multimodal and Efficient Foundation Models"☆15Oct 12, 2023Updated 2 years ago
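- Converts OCR and other models from Paddle to ONNX format, and experiments with loading these ONNX models for inference using DJL, the Java deep learning framework.☆14Nov 15, 2022Updated 3 years ago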
- PyTorch implementation of EfficientNet-lite and a spectrum of pre-trained models on ImageNet☆11Mar 20, 2020Updated 6 years ago
- Dynamic Multi-Context Segmentation of Remote Sensing Images based on Convolutional Networks☆13May 16, 2019Updated 7 years ago
- [ACL 2026 Main] Revisit What You See: Revealing Visual Semantics in Vision Tokens to Guide LVLM Decoding☆25Nov 21, 2025Updated 6 months ago
- DBNet text detection, with added text box classification☆14Jul 27, 2022Updated 3 years ago
- ☆13Apr 9, 2026Updated last month
- ☆12Sep 8, 2022Updated 3 years ago
- Increasing the scale and diversity of chart de-rendering data.☆12Mar 13, 2024Updated 2 years ago
- Project page for the ICDAR 2023 Paper "Inv3D: a high-resolution 3D invoice dataset for template-guided single-image document unwarping".☆13Dec 21, 2023Updated 2 years ago
- ☆18Mar 19, 2021Updated 5 years ago
- VisuRiddles: Fine-grained Perception is an important thing for Multimodal Large Models in Riddles Solving☆18Oct 22, 2025Updated 7 months ago
- Trusted Mamba Contrastive Network for Multi-View Clustering☆16Dec 10, 2025Updated 5 months ago
- Hourglass shape network for remote sensing imagery semantic segmentation☆20Jun 4, 2018Updated 7 years ago
- ☆12Jun 12, 2024Updated last year
- Vision-Language Pre-Training for Boosting Scene Text Detectors (CVPR2022)☆12Mar 21, 2022Updated 4 years ago
- ☆16May 15, 2025Updated last year
- Unofficial implementation of DocMAE (WIP): Document Image Rectification via Self-supervised Representation Learning☆20Dec 20, 2023Updated 2 years ago
- ☆13Mar 16, 2021Updated 5 years ago
- PyTorch dataset for reading large-scale data☆13May 30, 2022Updated 4 years ago
- A community effort to translate fastai video lessons from English to Chinese☆14May 2, 2019Updated 7 years ago
- [NAACL 2025] Beyond End-to-End VLMs: Leveraging Intermediate Text Representations for Superior Flowchart Understanding☆21Aug 23, 2025Updated 9 months ago
- The official code for "DaFIR: Distortion-Aware Representation Learning for Fisheye Image Rectification", TCSVT, 2023.☆13May 30, 2025Updated last year
- ☆15Jul 3, 2019Updated 6 years ago
- Chinese layout detection: YOLOv8 is used to detect the layout of Chinese document images.☆60Apr 28, 2023Updated 3 years ago
- Dockerfile for RL research, including MuJoCo / DMC / PyTorch / TensorFlow / Atari support.☆16Jan 5, 2022Updated 4 years ago
- Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at Neurips 2023 (Main confer…☆28Mar 29, 2024Updated 2 years ago
- The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch…☆28Jul 15, 2025Updated 10 months ago
- Official implementation of "MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model". Our co…☆26Dec 20, 2024Updated last year
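- Document image processing tool, including bleaching / text orientation correction / sharpness enhancement / note denoising and beautification / shadow removal / dewarping / edge-trim enhancement (DocBleach / TextOrientationCorrection / DocSha…☆132Aug 27, 2024Updated last year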
- Official implementation for Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos☆16May 23, 2023Updated 3 years ago
- [NeurIPS 2025 🔥] Official implementation for "Don't Just Chase “Highlighted Tokens” in MLLMs: Revisiting Visual Holistic Context Retenti…☆63Mar 5, 2026Updated 2 months ago
- Implementation code for UniMix and Bayias Compensated Loss☆19Mar 7, 2023Updated 3 years ago
- A drawable MNIST demo using streamlit.☆11Nov 27, 2020Updated 5 years ago
- [NeurIPS 2025 Spotlight] SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering [Pytorch repository]☆46May 21, 2026Updated last week
- Just for learning ffmpeg☆13Jul 11, 2022Updated 3 years ago
- Official code implementation of "TextDiff: Mask-Guided Residual Diffusion Models for Scene Text Image" in Pattern Recognition☆25Apr 24, 2024Updated 2 years ago