pawelrychlik / duplitector
A duplicate data detector engine PoC based on Elasticsearch.
☆20 Updated 10 years ago
Alternatives and similar repositories for duplitector
Users who are interested in duplitector compare it to the libraries listed below.
- common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text ☆35 Updated 9 years ago
- Google Drive river for Elasticsearch ☆20 Updated 10 years ago
- FacetView is a pure javascript frontend for ElasticSearch. ☆291 Updated 10 years ago
- ScaleGraph is an X10 billion scale graph analysis library. ☆21 Updated 9 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆25 Updated 9 years ago
- Term List Matching Plugin for ElasticSearch ☆26 Updated 11 years ago
- Python interface for OrientDB binary Serialization ☆10 Updated 5 years ago
- FreebaseAPI is a library to use the Freebase API (data mapper + low level API) ☆43 Updated 10 years ago
- Github mirror of "search/highlighter" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access… ☆103 Updated last month
- Curiosity is a generic frontend for faceting, displaying and editing data from any elasticsearch index. ☆75 Updated 9 years ago
- memcached transport plugin for elasticsearch (STOPPED) ☆34 Updated 2 years ago
- An extension to the demo template of ElasticUI, a beautiful AngularJS frontend to ElasticSearch for faceted navigation ☆39 Updated 10 years ago
- UberSocialNet—applying the Lambda Architecture ☆30 Updated 12 years ago
- Demo visualization of Neo4j data ☆162 Updated 8 years ago
- An attempt at creating a silver/gold standard dataset for backtesting yesterday & today's content-extractors ☆35 Updated 10 years ago
- The GitHub repository for the Copenhagen Dependency Treebanks exported from Google Code. The repository is still in the process of being … ☆11 Updated 5 years ago
- WordNet RDF export ☆24 Updated 8 years ago
- GraphAware Timer-Driven Runtime Module that executes PageRank-like algorithm on the graph ☆26 Updated 7 years ago
- Docker container to provide Apache Tika RESTful API ☆41 Updated 9 years ago
- Analysis plugin for ElasticSearch providing capability for processing inline annotations in documents. ☆35 Updated 11 years ago
- A bundle of useful Elasticsearch plugins ☆112 Updated last year
- Analyze standard numbers like ARK, DOI, EAN, GTIN, IBAN, ISAN, ISBN, ISMN, ISNI, ISSN, ISTC, ISWC, ORCID, PPN, SICI, UPC, ZDB with Elasti… ☆24 Updated 9 years ago
- Full text extraction using the Open Source Tesseract OCR software https://code.google.com/p/tesseract-ocr/ and imagemagick ☆13 Updated 10 years ago
- A Python implementation of causal inference of pathways using a Gibbs sampling approach ☆10 Updated 12 years ago
- a pure javascript frontend for ElasticSearch search indices. ☆80 Updated 7 years ago
- Focused Crawler for VT's CTRNet ☆10 Updated 12 years ago
- Verteego Data Suite ☆10 Updated 8 years ago
- Rich browser-based frontend for elasticsearch ☆102 Updated 10 years ago
- Baseform lemmatization for Elasticsearch ☆26 Updated 6 years ago
- Collects multimedia content shared through social networks. ☆19 Updated 10 years ago