opensemanticsearch / open-semantic-search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, …
☆998 · Updated last year
Alternatives and similar repositories for open-semantic-search:
Users who are interested in open-semantic-search are comparing it to the libraries listed below.
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆266 · Updated 2 years ago
- INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management. ☆609 · Updated this week
- Language, Knowledge, Cognition ☆587 · Updated this week
- LexNLP by LexPredict ☆710 · Updated 8 months ago
- Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums. ☆213 · Updated this week
- Carrot2: Text Clustering Algorithms and Applications ☆793 · Updated 4 months ago
- PDF to XML ALTO file converter ☆224 · Updated this week
- A web-based document annotation tool, powered by GPT-4 ☆258 · Updated last year
- Software that makes labeling PDFs easy. ☆405 · Updated 9 months ago
- 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based ☆311 · Updated last year
- The software used to extract structured data from Wikipedia ☆887 · Updated this week
- A machine learning tool for fishing entities ☆258 · Updated this week
- Lightweight web scraping toolkit for documents and structured data. ☆311 · Updated last year
- a Deep Learning Framework for Text https://delft.readthedocs.io/ ☆392 · Updated last week
- Science Parse parses scientific papers (in PDF form) and returns them in structured form. ☆644 · Updated 8 months ago
- 🆕 Work continues on INCEpTION 👉 https://github.com/inception-project/inception 👈 -- ⚠️ The official WebAnno repository has reached the… ☆244 · Updated 2 years ago
- Federated Knowledge Extraction Framework ☆191 · Updated last year
- Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of e… ☆193 · Updated 2 years ago
- Just the facts -- web page content extraction ☆1,258 · Updated 7 months ago
- Data model and processing tools for investigative entity data ☆224 · Updated this week
- A multilingual, cross-domain temporal tagger developed at the Database Systems Research Group at Heidelberg University. ☆343 · Updated last year
- SpikeX - SpaCy Pipes for Knowledge Extraction ☆397 · Updated 3 years ago
- Terrier IR Platform ☆257 · Updated last month
- A self-hosted search engine for documents. ☆612 · Updated this week
- NLP, before and after spaCy ☆2,215 · Updated last year
- Heuristic based boilerplate removal tool ☆747 · Updated 9 months ago
- A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Use only open source tools. Please tip! ☆280 · Updated last year
- Websites crawler with built-in exploration and control web interface ☆340 · Updated 3 weeks ago
- YAGO is a large semantic knowledge base, derived from Wikipedia, WordNet, WikiData, GeoNames, and other data sources ☆732 · Updated 2 years ago
- A machine learning software for extracting information from scholarly documents ☆3,795 · Updated this week