opensemanticsearch / open-semantic-search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, …
☆1,111Updated 8 months ago
Alternatives and similar repositories for open-semantic-search
Users who are interested in open-semantic-search are comparing it to the libraries listed below.
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N…☆276Updated 3 years ago
- Carrot2: Text Clustering Algorithms and Applications☆839Updated last month
- 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based☆329Updated 2 years ago
- Websites crawler with built-in exploration and control web interface☆368Updated 3 months ago
- A list of memex-related tools and their repository URLs☆155Updated 7 years ago
- ACHE is a web crawler for domain-specific search.☆475Updated 3 months ago
- Textricator is a tool to extract text from documents and generate structured data.☆350Updated 9 months ago
- A self‑hosted search engine for documents☆679Updated last week
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a…☆99Updated 3 years ago
- Language, Knowledge, Cognition☆624Updated 4 months ago
- LexNLP by LexPredict☆757Updated last year
- Ambar: Document Search Engine☆1,951Updated 4 years ago
- Core Python Web Archiving Toolkit for replay and recording of web archives☆1,593Updated 3 weeks ago
- A fork of Dragnet that also extracts author, headline, date, keywords from context, as well as built-in metadata extraction all in one pac…☆297Updated 7 months ago
- Index Common Crawl archives in tabular format☆124Updated this week
- Data model and processing tools for investigative entity data☆258Updated 2 weeks ago
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali…☆89Updated 5 years ago
- An open database of international sanctions data, persons of interest and politically exposed persons☆653Updated this week
- ☆114Updated last week
- PDF to XML ALTO file converter☆258Updated last month
- Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of e…☆196Updated 3 years ago
- brozzler - distributed browser-based web crawler☆765Updated this week
- Information Integration Tool☆604Updated 8 months ago
- Heuristic based boilerplate removal tool☆809Updated 10 months ago
- The software used to extract structured data from Wikipedia☆916Updated this week
- Software that makes labeling PDFs easy.☆423Updated last year
- The low-code Knowledge Graph application platform. Apache license.☆584Updated this week
- INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management.☆670Updated this week
- Obsei is a low code AI powered automation tool. It can be used in various business flows like social listening, AI based alerting, brand …☆1,368Updated last month
- Elasticsearch File System Crawler (FS Crawler)☆1,419Updated this week