multimodal document analysis
☆165 · Feb 28, 2026 · Updated last month
Alternatives and similar repositories for mmda
Users that are interested in mmda are comparing it to the libraries listed below.
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆180 · Mar 18, 2023 · Updated 3 years ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆35 · May 24, 2024 · Updated last year
- Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task… ☆287 · Feb 13, 2023 · Updated 3 years ago
- Software that makes labeling PDFs easy. ☆428 · May 13, 2024 · Updated last year
- Benchmark dataset for the evaluation of scientific article representations on the task of citation recommendation across various scientif… ☆12 · Oct 21, 2022 · Updated 3 years ago
- Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer. ☆18 · Apr 23, 2023 · Updated 2 years ago
- Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understan… ☆363 · Oct 31, 2022 · Updated 3 years ago
- ☆34 · Jan 2, 2024 · Updated 2 years ago
- Code for participation in ICDAR2021 Table Recognition track (Team Name: LTIAYN = Kaen Context) ☆22 · Jun 16, 2021 · Updated 4 years ago
- Index of URLs to pdf files all over the internet and scripts ☆25 · May 2, 2023 · Updated 2 years ago
- Library supporting NLP and CV research on scientific papers ☆793 · Nov 8, 2024 · Updated last year
- DocBank: A Benchmark Dataset for Document Layout Analysis ☆640 · Aug 12, 2024 · Updated last year
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆463 · Apr 11, 2024 · Updated 2 years ago
- A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents ☆31 · Dec 8, 2022 · Updated 3 years ago
- A Unified Toolkit for Deep Learning Based Document Image Analysis ☆5,711 · Aug 15, 2024 · Updated last year
- ☆18 · Oct 22, 2022 · Updated 3 years ago
- Japanese / English Bilingual LLM ☆28 · Dec 23, 2025 · Updated 3 months ago
- A cross-lingual COVID-19 fake news dataset ☆14 · Oct 14, 2021 · Updated 4 years ago
- ☆1,047 · Jul 9, 2025 · Updated 9 months ago
- ☆60 · Aug 18, 2021 · Updated 4 years ago
- ☆482 · Jul 8, 2025 · Updated 9 months ago
- Code for Analyzing Redundancy in Pretrained Transformer Models accepted at EMNLP 2020 ☆14 · Oct 6, 2020 · Updated 5 years ago
- PyTorch implementation of Chinese multi-turn dialogue (Cdial) based on GPT + NEZHA ☆11 · Oct 22, 2022 · Updated 3 years ago
- Tool to parse wiki tables from the HTML dump of Wikipedia ☆11 · Jun 12, 2022 · Updated 3 years ago
- API client for fetching and comparing passages from legislation ☆14 · Jan 26, 2025 · Updated last year
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ☆1,038 · Apr 26, 2024 · Updated last year
- The corresponding code for our paper: "Exploring the Challenges of Open Domain Multi-Document Summarization". Do not hesitate to open an … ☆33 · Jun 24, 2023 · Updated 2 years ago
- A curated list of resources for Document Understanding (DU) topic ☆1,508 · Jun 2, 2023 · Updated 2 years ago
- Data/Code Repository for https://api.semanticscholar.org/CorpusID:218470122 ☆140 · Jul 25, 2024 · Updated last year
- Download client for legal opinions ☆13 · Jan 26, 2025 · Updated last year
- Algorithms, papers, datasets, performance comparisons for Document AI. ☆206 · Mar 1, 2025 · Updated last year
- ☆15 · Jun 16, 2021 · Updated 4 years ago
- SPECTER: Document-level Representation Learning using Citation-informed Transformers ☆575 · Jun 12, 2023 · Updated 2 years ago
- A machine learning tool for fishing entities ☆268 · Feb 27, 2026 · Updated last month
- ☆14 · Aug 3, 2022 · Updated 3 years ago
- Unifew: Unified Fewshot Learning Model ☆18 · Sep 10, 2021 · Updated 4 years ago
- ↔️ Utilizing RBERT model structure for KLUE Relation Extraction task ☆15 · Nov 15, 2022 · Updated 3 years ago
- A Python library aimed at dissecting and augmenting NER training data. ☆61 · May 11, 2023 · Updated 2 years ago
- Experiment on metadata extraction using large language models such as GPT-3 ☆12 · Feb 1, 2023 · Updated 3 years ago