wzlxjtu / PDF2LaTeX-dataset
☆21Updated 5 years ago
Alternatives and similar repositories for PDF2LaTeX-dataset
Users who are interested in PDF2LaTeX-dataset are comparing it to the repositories listed below
- Train a neural network to produce LaTeX source code that generates a given PDF file☆13Updated 8 years ago
- Another LaTeX equation OCR tool based on ConvNeXt and Transformer☆51Updated 2 years ago
- Scanning Single Shot Detector for Math in Document Images☆132Updated 2 years ago
- Codebase for fine-tuning / evaluating nougat-based image2latex generation models☆160Updated last year
- TDF-ICDAR 2019 Dataset for Typeset Math Formula Detection☆69Updated 5 years ago
- A neural network capable of translating handwriting into text along with complex tools to generate datasets☆19Updated 5 years ago
- Solution to OpenAI's im2latex request for research☆89Updated last year
- Data repository for LaTeX OCR☆132Updated last year
- Large-scale training of a LaTeX formula recognition model, currently being organized and open-sourced☆54Updated last year
- Math formula recognition (Images to LaTeX strings)☆306Updated 2 years ago
- ReadingBank: A Benchmark Dataset for Reading Order Detection☆112Updated last year
- Code for ICPR2022 paper: "Graph Neural Networks and Representation Embedding for table extraction in PDF Documents"☆35Updated 2 years ago
- Python tools for creating suitable dataset for OpenAI's im2latex task: https://openai.com/requests-for-research/#im2latex☆141Updated 6 years ago
- The Soft Cosine Measure system developed for the ARQMath-3 shared task evaluation of math information retrieval systems☆13Updated 3 years ago
- Official implementation for ICDAR 2021 best poster paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Tr…☆125Updated last year
- A tool for extracting arbitrary tables from untagged PDF documents☆40Updated 4 years ago
- ☆45Updated 3 years ago
- multimodal document analysis☆166Updated last year
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis☆395Updated 2 years ago
- A GPT-based generative LM for combined text and math formulas, leveraging tree-based formula encoding. Published as "Tree-Based Represent…☆40Updated 2 years ago
- DIAR software for synthetic document image and groundtruth generation, with various degradation models for data augmentation☆128Updated last year
- 1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection (champion solution for formula detection)☆132Updated 2 years ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification☆179Updated 2 years ago
- ICDAR 2021 Competition on Scientific Literature Parsing☆35Updated 5 years ago
- Convert handwritten mathematical expressions and formulas to LaTeX using machine learning☆75Updated 7 years ago
- Converts from AsciiMath, LaTeX, MathML to LaTeX, MathML☆59Updated 5 years ago
- Logical structure analysis for visually structured documents☆92Updated 3 years ago
- Dataset of PNG images from synthetically generated table layouts with annotations in JSONL files☆152Updated last month
- Fully automated end-to-end framework to extract data from bar plots and other figures in scientific research papers using modules such as…☆121Updated 4 years ago
- Data used for LSTM model training☆123Updated last year