multimodal document analysis
☆166 · May 14, 2026 · Updated last month
Alternatives and similar repositories for mmda
Users that are interested in mmda are comparing it to the libraries listed below.
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆180 · Mar 18, 2023 · Updated 3 years ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆35 · May 24, 2024 · Updated 2 years ago
- S2APLER: S2 Agglomeration of Papers with Low Error Rate (it's for academic paper clustering) ☆22 · May 15, 2026 · Updated last month
- Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task… ☆289 · Feb 13, 2023 · Updated 3 years ago
- Software that makes labeling PDFs easy. ☆429 · May 13, 2024 · Updated 2 years ago
- Benchmark dataset for the evaluation of scientific article representations on the task of citation recommendation across various scientif… ☆12 · Oct 21, 2022 · Updated 3 years ago
- Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer. ☆18 · Apr 23, 2023 · Updated 3 years ago
- Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understan… ☆366 · Oct 31, 2022 · Updated 3 years ago
- ☆34 · Jan 2, 2024 · Updated 2 years ago
- code for participation in ICDAR2021 Table Recognition track (Team Name: LTIAYN = Kaen Context) ☆22 · Jun 16, 2021 · Updated 4 years ago
- Index of URLs to pdf files all over the internet and scripts ☆24 · May 2, 2023 · Updated 3 years ago
- library supporting NLP and CV research on scientific papers ☆797 · Nov 8, 2024 · Updated last year
- DocBank: A Benchmark Dataset for Document Layout Analysis ☆646 · Aug 12, 2024 · Updated last year
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆469 · Apr 11, 2024 · Updated 2 years ago
- A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents ☆32 · Dec 8, 2022 · Updated 3 years ago
- A Unified Toolkit for Deep Learning Based Document Image Analysis ☆5,739 · Aug 15, 2024 · Updated last year
- ☆18 · Oct 22, 2022 · Updated 3 years ago
- Japanese / English Bilingual LLM ☆30 · Dec 23, 2025 · Updated 5 months ago
- ☆1,047 · Jul 9, 2025 · Updated 11 months ago
- ☆61 · Aug 18, 2021 · Updated 4 years ago
- ☆483 · Jul 8, 2025 · Updated 11 months ago
- Code for Analyzing Redundancy in Pretrained Transformer Models accepted at EMNLP 2020 ☆14 · Oct 6, 2020 · Updated 5 years ago
- Tool to parse wiki tables from the HTML dump of Wikipedia ☆11 · Jun 12, 2022 · Updated 4 years ago
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ☆1,062 · Apr 26, 2024 · Updated 2 years ago
- The corresponding code for our paper: "Exploring the Challenges of Open Domain Multi-Document Summarization". Do not hesitate to open an … ☆33 · Jun 24, 2023 · Updated 2 years ago
- A curated list of resources for Document Understanding (DU) topic ☆1,521 · Jun 2, 2023 · Updated 3 years ago
- Data/Code Repository for https://api.semanticscholar.org/CorpusID:218470122 ☆140 · Jul 25, 2024 · Updated last year
- Download client for legal opinions ☆13 · Updated this week
- Ukrainian ELECTRA model ☆12 · Mar 11, 2023 · Updated 3 years ago
- Algorithms, papers, datasets, performance comparisons for Document AI. ☆209 · Mar 1, 2025 · Updated last year
- ☆15 · Jun 16, 2021 · Updated 4 years ago
- SPECTER: Document-level Representation Learning using Citation-informed Transformers ☆582 · Jun 12, 2023 · Updated 3 years ago
- A machine learning tool for fishing entities ☆268 · Feb 27, 2026 · Updated 3 months ago
- ☆14 · Aug 3, 2022 · Updated 3 years ago
- Unifew: Unified Fewshot Learning Model ☆18 · Sep 10, 2021 · Updated 4 years ago
- ↔️ Utilizing RBERT model structure for KLUE Relation Extraction task ☆15 · Nov 15, 2022 · Updated 3 years ago
- Experiment on metadata extraction using large language models such as GPT-3 ☆12 · Feb 1, 2023 · Updated 3 years ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper. ☆12 · Mar 6, 2023 · Updated 3 years ago
- A simple library for segmenting legal texts ☆18 · Apr 22, 2023 · Updated 3 years ago