paperai / pdfanno
Linguistic Annotation and Visualization Tool for PDF Documents
☆199 Updated 5 years ago
Alternatives and similar repositories for pdfanno
Users who are interested in pdfanno are comparing it to the libraries listed below.
- a Deep Learning Framework for Text https://delft.readthedocs.io/ ☆399 Updated 2 weeks ago
- Work continues on INCEpTION https://github.com/inception-project/inception -- ⚠️ The official WebAnno repository has reached the… ☆246 Updated 2 years ago
- Anafora is a web-based raw text annotation tool ☆243 Updated 2 years ago
- GROBID extension for identifying and normalizing physical quantities. ☆82 Updated last week
- A project about benchmarking and evaluating existing PDF extraction tools on their semantic abilities to extract the body texts from PDF … ☆68 Updated 4 years ago
- Toolbox for OCR post-correction ☆121 Updated 5 years ago
- Neuralized version of the Reference String Parser component of the ParsCit package. ☆81 Updated 3 years ago
- High-level build project for all LAPDF-Text submodules ☆103 Updated 9 years ago
- Command line tool to extract figures, tables, and captions from scholarly documents in PDF form. ☆130 Updated 7 years ago
- Python library for Natural Language Preprocessing (NLPre) ☆191 Updated last year
- Software that makes labeling PDFs easy. ☆415 Updated last year
- PDF to XML ALTO file converter ☆244 Updated 2 weeks ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆112 Updated 5 months ago
- A Named-Entity Recogniser based on Grobid. ☆53 Updated last month
- Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG. ☆105 Updated 2 years ago
- A tool for visualizing trees, tailored specifically to the analysis of parse trees. ☆81 Updated 4 years ago
- Table Extraction Tool ☆90 Updated 7 years ago
- A Python implementation of the SimString, a simple and efficient algorithm for approximate string matching. ☆123 Updated last year
- A way to do annotations for NER. TALEN: Tool for Annotation of Low-resource ENtities ☆115 Updated 3 years ago
- A set of tools to allow PDF to XML conversion, utilising Apache Beam and other tools. The aim of this project is to bring multiple tools… ☆294 Updated 3 years ago
- A machine learning tool for fishing entities ☆264 Updated last month
- A modular annotation system that supports complex, interactive annotation graphs embedded on top of sequences of text. ☆95 Updated 3 years ago
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆214 Updated 5 years ago
- Corpus of Open Access articles from multiple fields in Science, Technology, and Medicine. ☆73 Updated 8 years ago
- CoNLL-U format library for JavaScript ☆72 Updated 8 years ago
- Character Based Named Entity Recognition. ☆40 Updated 7 years ago
- Word Embeddings for Information Retrieval ☆225 Updated last year
- PDF parser and converter to HTML ☆85 Updated 8 months ago
- Get annotation suggestions for the INCEpTION text annotation platform from spaCy, Sentence BERT, scikit-learn and more. Runs as a web-ser… ☆46 Updated 8 months ago
- Python library that classifies content from scientific papers with the topics of the Computer Science Ontology (CSO). ☆90 Updated 6 months ago