jnweiger / pdfcompare
compare two PDF files, write a resulting PDF with highlighted changes
☆54 · Updated last month
Related projects:
- LocalCopy is a plugin that extends the popular reference manager JabRef. It provides an automatic download feature for preprints from the… ☆28 · Updated 12 years ago
- PDF Extraction Toolkit ☆41 · Updated 3 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆44 · Updated 5 months ago
- An expandable and scalable OCR pipeline ☆86 · Updated 6 years ago
- Compare documents using MS Word from the command line. ☆123 · Updated 8 months ago
- ☆37 · Updated 8 years ago
- The hOCR Embedded OCR Workflow and Output Format ☆72 · Updated last month
- PDF to XML ALTO file converter ☆209 · Updated this week
- OCR for DjVu ☆44 · Updated last year
- BibSync is a tool to synchronize scientific papers and bibtex bibliography files ☆59 · Updated 10 years ago
- PDF Table Extractor - repository to hold revisable version of code from https://www.cvast.tuwien.ac.at/projects/pdf2table by Burcu Yildiz ☆38 · Updated 6 months ago
- Extracts highlighted text from PDF documents. ☆31 · Updated 6 years ago
- MathWebSearch Implementation ☆46 · Updated last year
- A toolkit for clustering web pages based on various similarity measures. ☆32 · Updated 2 years ago
- Multi-Entity Extraction Framework for Academic Documents (with default extraction tools) ☆29 · Updated 11 months ago
- BMC (BiblioManagementClient) is a simple script to download and store your articles. ☆16 · Updated 8 years ago
- MOVED TO https://gitlab.com/crossref/pdfmark ☆33 · Updated 5 years ago
- pdf2xml converter based on Xpdf library - modified version ☆27 · Updated 6 years ago
- smoothscan is a tool to convert scanned text into a vectorized output form. ☆67 · Updated 10 years ago
- List of tools for dealing with the wonderful PDF format. ☆46 · Updated 3 years ago
- A web based data mining workflow platform with real-time analysis capabilities ☆48 · Updated last year
- Microsoft (MS) EMF to SVG conversion library ☆95 · Updated last month
- An efficient data structure for fast string similarity searches ☆23 · Updated 3 years ago
- copy of pdftohtml code with enhancements ☆25 · Updated 10 months ago
- LightSide Workbench ☆24 · Updated 11 months ago
- Lexer and codec to work with LaTeX code in Python. Instead of using latexcodec, I encourage you to consider pylatexenc instead, which is … ☆27 · Updated 5 months ago
- Ergonomic line-by-line transcription of scanned text. ☆47 · Updated 3 years ago
- Scripts and results from our OCR roundup, available on Source ☆150 · Updated 5 years ago
- Get semantic HTML from PDFs, recover lost text, tables, data... in bulk. ☆28 · Updated 9 months ago
- Node.js implementation of pandoc filter to turn TeX math into embedded SVG ☆21 · Updated last year