measuresforjustice / textricator
Textricator is a tool to extract text from documents and generate structured data.
☆346Updated 4 months ago
Alternatives and similar repositories for textricator
Users who are interested in textricator are comparing it to the libraries listed below.
- Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Tex…☆1,053Updated 2 months ago
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N…☆269Updated 2 years ago
- A cross-platform command line tool for parallelised content extraction and analysis.☆245Updated last week
- web interface for recoll desktop search☆288Updated 4 years ago
- Ergonomic line-by-line transcription of scanned text.☆53Updated 4 years ago
- 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based☆323Updated last year
- A set of tools to allow PDF to XML conversion, utilising Apache Beam and other tools. The aim of this project is to bring multiple tools…☆294Updated 3 years ago
- PDF to XML ALTO file converter☆246Updated last week
- Data Curator - share usable open data☆274Updated 3 years ago
- Run Overview on your own system☆124Updated 3 years ago
- A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Uses only open source tools. Please tip!☆291Updated last month
- A framework for creating web-based knowledge maps☆204Updated last week
- ☆100Updated this week
- Extract tables from PDF pages.☆293Updated 5 years ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)☆190Updated last month
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.☆395Updated 11 months ago
- Extract tables from PDF files☆357Updated 9 years ago
- ☆210Updated 4 years ago
- ☆98Updated 4 years ago
- A simple OpenRefine reconciliation service that runs on top of a CSV file☆121Updated 10 years ago
- Fidus Writer is an online collaborative editor for academics.☆546Updated last month
- A range of tools to help you get more out of NVivo(tm)☆53Updated 3 years ago
- A self-hosted search engine for documents. Fill out our user survey about structured content: https://forms.gle/PYgusFsoBaMyzUec9