☆57Jan 23, 2024Updated 2 years ago
Alternatives and similar repositories for Vary-family
Users that are interested in Vary-family are comparing it to the libraries listed below.
- Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)☆629Dec 30, 2024Updated last year
- Vary-tiny codebase built upon LAVIS (for training from scratch) and a PDF image-text pair dataset (about 600k pairs, including English/Chinese)☆89Sep 21, 2024Updated last year
- Keypoint dataset for airplane☆10Dec 28, 2019Updated 6 years ago
- [ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.☆1,889Dec 30, 2024Updated last year
- Accelerating GOT-OCRv2 with VLLM☆10Nov 15, 2024Updated last year
- Official code implementation of Slow Perception: Let's Perceive Geometric Figures Step-by-step☆161Jul 28, 2025Updated 11 months ago
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"☆196May 31, 2024Updated 2 years ago
- Official implementation for Dessurt: Document end-to-end self-supervised understanding and recognition transformer☆62Jan 11, 2023Updated 3 years ago
- [ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"☆266Apr 14, 2025Updated last year
- A Dead Simple and Modularized Multi-Modal Training and Finetune Framework. Compatible with any LLaVA/Flamingo/QwenVL/MiniGemini etc series …☆19Apr 24, 2024Updated 2 years ago
- a family of highly capable yet efficient large multimodal models☆193Aug 23, 2024Updated last year
- Minimal user-friendly demo of OpenAI's CLIP for semantic image search☆19Sep 28, 2024Updated last year
- Dataset and scripts for HRDoc☆41Jun 21, 2023Updated 3 years ago
- Research on accelerating GOT-OCR for production deployment, in any language☆62Oct 24, 2024Updated last year
- InternVL2 plugin for ComfyUI. InternVL2 is currently a strong open-source multimodal large language model that performs well on document VQA☆13Aug 10, 2024Updated last year
- Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations☆149Sep 28, 2025Updated 9 months ago
- ☆163May 8, 2025Updated last year
- An open-source, commercially usable multimodal model supporting bilingual Chinese-English visual-text dialogue.☆379Sep 23, 2023Updated 2 years ago
- Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"☆268Jun 12, 2024Updated 2 years ago
- Chinese CLIP models with SOTA performance.☆62Aug 28, 2023Updated 2 years ago
- Using llama.cpp and onnxruntime to accelerate inference of GOT-OCR2.0☆15Mar 6, 2025Updated last year
- ☆28Dec 11, 2025Updated 6 months ago
- [Paper] Code for the EMNLP2023 (Findings) paper "Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document"☆17Dec 1, 2023Updated 2 years ago
- [ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning☆13Sep 2, 2024Updated last year
- ☆189Feb 27, 2024Updated 2 years ago
- [ICLR2025] Official code implementation of Video-UTR: Unhackable Temporal Rewarding for Scalable Video MLLMs☆61Feb 27, 2025Updated last year
- PDF Parsing Tool: GOT's vLLM acceleration implementation, MinerU for layout recognition, and GOT for table formula parsing.☆65Nov 7, 2024Updated last year
- Code and Dataset for our paper: Layout-Aware Single-Image Document Flattening☆24Dec 16, 2024Updated last year
- This is the official repository of the revised datasets FUNSD-r and CORD-r, introduced in EMNLP 2023 paper Reading Order Matters: Informa…☆17Mar 20, 2024Updated 2 years ago
- Algorithms, papers, datasets, performance comparisons for Document AI.☆209Mar 1, 2025Updated last year
- ENet segmentation on Android, based on ncnn☆17Mar 29, 2020Updated 6 years ago
- Transformer related optimization, including BERT, GPT☆17Jul 29, 2023Updated 2 years ago
- ☆143Feb 13, 2024Updated 2 years ago
- ☆19Mar 28, 2022Updated 4 years ago
- DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction☆30Jun 28, 2023Updated 3 years ago
- Implementation of Unsupervised Pixel–Level Domain Adaptation with Generative Adversarial Networks by Google☆15Jan 10, 2017Updated 9 years ago
- Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train …☆227Jun 12, 2025Updated last year
- A native Chinese industrial evaluation benchmark☆17Mar 21, 2024Updated 2 years ago
- Data and code for ACL 2022 paper "MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data"☆54Oct 22, 2024Updated last year