dsidavis / pdftohtml
copy of pdftohtml code with enhancements
☆25 Updated 10 months ago
Related projects:
- PDF Extraction Toolkit ☆41 Updated 3 years ago
- Extract citations from PDFs. ☆28 Updated 10 years ago
- Structured Data from PDF image-based files ☆87 Updated 11 years ago
- Zorba - the NoSQL processor ☆42 Updated 9 months ago
- Multi-Entity Extraction Framework for Academic Documents (with default extraction tools) ☆29 Updated 11 months ago
- Edit Textbooks using Javascript and save to GitHub ☆102 Updated 6 years ago
- MOVED TO https://gitlab.com/crossref/pdfmark ☆33 Updated 5 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆44 Updated 5 months ago
- A web application to create and edit EPUBs, written in CakePHP. ☆17 Updated 9 years ago
- PolyTeX to LaTeX and HTML ☆48 Updated 4 months ago
- Scripts and Howtos about using different CSL (Citation Style Language) files with pandoc ☆35 Updated 5 years ago
- Drawing tree structures with SVG and JavaScript ☆34 Updated 9 years ago
- Docvert for Python3: Converts Office files to DocBook and clean HTML, diagrams to SVG/PNG, etc. ☆35 Updated 8 years ago
- ☆19 Updated 6 years ago
- PDF Table Extractor - repository to hold revisable version of code from https://www.cvast.tuwien.ac.at/projects/pdf2table by Burcu Yildiz ☆38 Updated 6 months ago
- PDBF - A Toolkit for Creating Janiform Data Documents ☆49 Updated 8 years ago
- BibJSON spec and website ☆19 Updated 9 years ago
- ☆17 Updated 9 years ago
- A visual note taking app made with React ☆22 Updated 7 years ago
- Super-project that aggregates all Pipeline related code, provides a common tracker for Pipeline related issues and holds the Pipeline web… ☆20 Updated this week
- CAD-Data of Libreflip ☆18 Updated 5 years ago
- Read natural language interactive queries. Great for bots. ☆18 Updated 7 years ago
- The CIS OCR PostCorrectionTool ☆39 Updated last year
- User contributed (non Google) OCR models for Tesseract ☆19 Updated last year
- A small Docker built for the OCRopus OCR system. ☆19 Updated 6 years ago
- U.S. Code Complexity ☆22 Updated 11 years ago
- A Lua custom writer for Pandoc generating JATS XML ☆74 Updated 6 years ago
- A PDF collection reader with built-in full-text search engine ☆19 Updated 7 years ago
- A simple viewer and inspection tool for text boxes in PDF documents ☆91 Updated 2 years ago
- A fast way to convert rasterized straight edges into vectors. ☆0 Updated 8 months ago