opensemanticsearch / open-semantic-search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, …
☆1,025 Updated last week
Alternatives and similar repositories for open-semantic-search:
Users who are interested in open-semantic-search are comparing it to the libraries listed below.
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆268 Updated 2 years ago
- Language, Knowledge, Cognition ☆601 Updated 2 months ago
- PDF to XML ALTO file converter ☆237 Updated last week
- YAGO is a large semantic knowledge base, derived from Wikipedia, WordNet, WikiData, GeoNames, and other data sources ☆735 Updated 2 years ago
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a… ☆97 Updated 2 years ago
- LexNLP by LexPredict ☆719 Updated 10 months ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆389 Updated 8 months ago
- The software used to extract structured data from Wikipedia ☆892 Updated 2 months ago
- Textricator is a tool to extract text from documents and generate structured data. ☆347 Updated last month
- NLP, before and after spaCy ☆2,224 Updated last year
- ACHE is a web crawler for domain-specific search. ☆468 Updated last year
- 🏖TagEditor - Annotation tool for spaCy ☆192 Updated 2 years ago
- A list of memex-related tools and their repository URLs ☆149 Updated 7 years ago
- An API to scrape American court websites for metadata. ☆414 Updated last week
- A spaCy pipeline and model for NLP on unstructured legal text. ☆649 Updated 9 months ago
- A machine learning tool for fishing entities ☆264 Updated 3 weeks ago
- A curated list of Knowledge Graph related learning materials, databases, tools and other resources ☆1,607 Updated 2 months ago
- Ambar: Document Search Engine ☆1,950 Updated 3 years ago
- LexPredict ContraxSuite ☆168 Updated 2 years ago
- NBoost is a scalable, search-api-boosting platform for deploying transformer models to improve the relevance of search results on differe… ☆676 Updated 4 years ago
- Carrot2: Text Clustering Algorithms and Applications ☆800 Updated last month
- Heuristic based boilerplate removal tool ☆766 Updated 2 months ago
- LexPredict Legal Dictionaries ☆117 Updated 2 years ago
- Elasticsearch with BERT for advanced document search. ☆898 Updated last year
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali… ☆84 Updated 5 years ago
- A web interface to extract tabular data from PDFs ☆1,651 Updated 3 months ago
- A search interface and wayback machine for the UKWA Solr based warc-indexer framework. ☆112 Updated this week
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆188 Updated 2 months ago
- OCR engine for all the languages ☆814 Updated 2 weeks ago
- Article extraction benchmark: dataset and evaluation scripts ☆312 Updated last year