opensemanticsearch / open-semantic-search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, …
☆978 · Updated last year
Related projects
Alternatives and complementary repositories for open-semantic-search
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆262 · Updated 2 years ago
- Carrot2: Text Clustering Algorithms and Applications ☆774 · Updated last month
- The software used to extract structured data from Wikipedia ☆858 · Updated 3 months ago
- Just the facts -- web page content extraction ☆1,254 · Updated 4 months ago
- Language, Knowledge, Cognition ☆585 · Updated 3 weeks ago
- INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management. ☆600 · Updated this week
- A cross-platform command line tool for parallelised content extraction and analysis. ☆241 · Updated 2 months ago
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a… ☆95 · Updated 2 years ago
- Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of e… ☆180 · Updated 2 years ago
- A list of selected resources, methods, and tools dedicated to legal data schemes and ontologies. ☆91 · Updated 7 months ago
- A self-hosted search engine for documents. ☆598 · Updated this week
- Websites crawler with built-in exploration and control web interface ☆328 · Updated 2 months ago
- LexNLP by LexPredict ☆703 · Updated 5 months ago
- ACHE is a web crawler for domain-specific search. ☆454 · Updated last year
- Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums. ☆203 · Updated last week
- A machine learning tool for fishing entities ☆248 · Updated last week
- Run Overview on your own system ☆123 · Updated 3 years ago
- Textricator is a tool to extract text from documents and generate structured data. ☆346 · Updated 3 weeks ago
- YAGO is a large semantic knowledge base, derived from Wikipedia, WordNet, WikiData, GeoNames, and other data sources ☆729 · Updated 2 years ago
- Federated Knowledge Extraction Framework ☆191 · Updated last year
- Ambar: Document Search Engine ☆1,947 · Updated 3 years ago
- Information Integration Tool ☆588 · Updated 7 months ago
- Search and browse documents and data; find the people and companies you look for. ☆2,036 · Updated this week
- Wandora is a general purpose information extraction, management and publishing application based on Topic Maps and Java. ☆129 · Updated last year
- A web-based document annotation tool, powered by GPT-4 ☆251 · Updated 10 months ago
- A search interface and wayback machine for the UKWA Solr based warc-indexer framework. ☆102 · Updated last week
- Streaming WARC/ARC library for fast web archive IO ☆386 · Updated last week
- The webprotege code base ☆629 · Updated 8 months ago
- Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application … ☆868 · Updated last week
- Fact Extraction from Wikipedia Text ☆531 · Updated 8 years ago