wzlxjtu / PDF2LaTeX-dataset
☆21 Updated 4 years ago
Alternatives and similar repositories for PDF2LaTeX-dataset
Users who are interested in PDF2LaTeX-dataset are comparing it to the repositories listed below.
- Solution to im2latex request for research of openai ☆90 Updated last year
- Scanning Single Shot Detector for Math in Document Images ☆130 Updated 2 years ago
- Official implementation for ICDAR 2021 best poster paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Tr… ☆124 Updated last year
- TDF-ICDAR 2019 Dataset for Typeset Math Formula Detection ☆69 Updated 5 years ago
- A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents ☆25 Updated 2 years ago
- Converts from AsciiMath, LaTeX, MathML to LaTeX, MathML ☆56 Updated 5 years ago
- A GPT-based generative LM for combined text and math formulas, leveraging tree-based formula encoding. ☆39 Updated last year
- Image to LaTeX pytorch model ☆14 Updated last year
- ReadingBank: A Benchmark Dataset for Reading Order Detection ☆105 Updated 9 months ago
- Python tools for creating suitable dataset for OpenAI's im2latex task: https://openai.com/requests-for-research/#im2latex ☆139 Updated 6 years ago
- Python and JS tools to generate printed LaTeX formulas and images ☆16 Updated last year
- DocBankLoader is a dataset loader for DocBank, and can convert DocBank to the Object Detection models' format. ☆24 Updated 4 years ago
- Math formula recognition (Images to LaTeX strings) ☆302 Updated last year
- GTDB dataset for training & evaluation for mathematical OCR systems ☆28 Updated 4 years ago
- A neural network capable of translating handwriting into text along with complex tools to generate datasets ☆20 Updated 5 years ago
- Handwritten mathematical symbols recognition with TrOCR ☆18 Updated last year
- Math-aware QA system ☆19 Updated 2 years ago
- multimodal document analysis ☆164 Updated last year
- Codebase for fine-tuning / evaluating nougat-based image2latex generation models ☆152 Updated 8 months ago
- Chinese Mathematical Formula Detection (MFD) Dataset: a math formula detection dataset for Chinese documents ☆34 Updated 2 years ago
- Code for ICPR2022 paper: "Graph Neural Networks and Representation Embedding for table extraction in PDF Documents" ☆35 Updated last year
- transformer based OCR framework used to train OCR or image to latex ☆9 Updated 2 years ago
- ☆9 Updated 5 years ago
- Data and Code for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning" ☆151 Updated 2 months ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆178 Updated 2 years ago
- ☆71 Updated 3 years ago
- Question Answering dataset generator of Document Visual in English and Chinese ☆24 Updated 2 years ago
- DocILE: Document Information Localization and Extraction Benchmark ☆129 Updated last year
- We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datas… ☆80 Updated 2 years ago
- CodeAssist is an advanced code completion tool that provides high-quality code completions for Python, Java, C++ and so on. CodeAssist is a… ☆58 Updated last year