wzlxjtu / PDF2LaTeX-dataset
☆21Updated 4 years ago
Alternatives and similar repositories for PDF2LaTeX-dataset
Users who are interested in PDF2LaTeX-dataset are comparing it to the repositories listed below:
- Train a neural network to produce LaTeX source code that generates a given PDF file☆12Updated 7 years ago
- Scanning Single Shot Detector for Math in Document Images☆130Updated last year
- Math-aware QA system☆18Updated 2 years ago
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, …☆40Updated last year
- Code for ICPR2022 paper: "Graph Neural Networks and Representation Embedding for table extraction in PDF Documents"☆35Updated last year
- Another LaTeX equation OCR tool based on ConvNeXt and Transformer☆49Updated last year
- Solution to OpenAI's im2latex Request for Research☆89Updated 11 months ago
- A GPT-based generative LM for combined text and math formulas, leveraging tree-based formula encoding.☆35Updated last year
- TFD-ICDAR 2019 Dataset for Typeset Math Formula Detection☆67Updated 5 years ago
- Python and JS tools to generate printed LaTeX formulas and images☆15Updated last year
- Converts from AsciiMath, LaTeX, MathML to LaTeX, MathML☆54Updated 5 years ago
- Image-to-LaTeX PyTorch model☆14Updated last year
- Code for the paper "Learning to Prove Theorems by Learning to Generate Theorems"☆32Updated 4 years ago
- ☆147Updated 10 months ago
- ICDAR 2021 Competition on Scientific Literature Parsing☆34Updated 4 years ago
- DocBankLoader is a dataset loader for DocBank, and can convert DocBank to the Object Detection models' format.☆23Updated 4 years ago
- Training a reward model for RLHF using RWKV.☆14Updated last year
- ReadingBank: A Benchmark Dataset for Reading Order Detection☆104Updated 7 months ago
- A command line interface to download PDF files from https://arxiv.org.☆47Updated 11 months ago
- A turnkey command for converting a LaTeX source to ar5iv-style HTML☆63Updated last year
- ☆43Updated 2 years ago
- Transformer-based OCR framework for training OCR or image-to-LaTeX models☆9Updated 2 years ago
- 1st-place solution for the ICDAR 2021 Competition on Mathematical Formula Detection (champion solution for formula detection)☆130Updated last year
- Formal representation and solving for Euclidean plane geometry problems.☆19Updated 2 months ago
- Discovering Mathematical Objects of Interest - A Study of Mathematical Notations☆10Updated 5 years ago
- A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents☆23Updated 2 years ago
- WikiTableSet: the largest publicly available image-based table recognition dataset in three languages, built from Wikipedia☆28Updated 2 years ago
- The multilingual variant of GLM, a general language model trained with an autoregressive blank-infilling objective☆62Updated 2 years ago
- ☆67Updated 3 years ago
- Scripts for downloading and pre-processing the `proof-pile`, a high-quality dataset of mathematical text and code.☆19Updated 2 years ago