pawelrychlik / duplitector
A duplicate data detector engine PoC based on Elasticsearch.
☆20Updated 10 years ago
Alternatives and similar repositories for duplitector
Users who are interested in duplitector are comparing it to the libraries listed below.
- Analyze standard numbers like ARK, DOI, EAN, GTIN, IBAN, ISAN, ISBN, ISMN, ISNI, ISSN, ISTC, ISWC, ORCID, PPN, SICI, UPC, ZDB with Elasti…☆24Updated 9 years ago
- Google Drive river for Elasticsearch☆20Updated 11 years ago
- A command line and Python client for Open-Spending☆10Updated 8 years ago
- ScaleGraph is an X10 billion scale graph analysis library.☆21Updated 10 years ago
- common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text☆35Updated 9 years ago
- a pure javascript frontend for ElasticSearch search indices.☆80Updated 7 years ago
- Practice exercise from the NLP workshop at #NodeConfAR2017☆10Updated 8 years ago
- Term List Matching Plugin for ElasticSearch☆26Updated 12 years ago
- FacetView is a pure javascript frontend for ElasticSearch.☆291Updated 10 years ago
- memcached transport plugin for elasticsearch (STOPPED)☆34Updated 2 years ago
- Baseform lemmatization for Elasticsearch☆26Updated 6 years ago
- ☆21Updated 12 years ago
- Python interface for OrientDB binary Serialization☆10Updated 5 years ago
- A stemmer for Slovak language☆12Updated 8 years ago
- An extension to the demo template of ElasticUI a beautiful AngularJS frontend to ElasticSearch for faceted navigation☆39Updated 10 years ago
- Analysis plugin for ElasticSearch providing capability for processing inline annotations in documents.☆35Updated 12 years ago
- An Elasticsearch plugin that enables you to keep only the N latest indices.☆18Updated 11 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED]☆25Updated 9 years ago
- Raw Wikipedia counts for entity linking☆19Updated 8 years ago
- The first Open Source document analysis platform☆65Updated 4 years ago
- The GitHub repository for the Copenhagen Dependency Treebanks exported from Google Code. The repository is still in the process of being …☆11Updated 5 years ago
- An attempt at creating a gold standard dataset for backtesting yesterday & today's content-extractors☆35Updated 10 years ago
- Brand disambiguator for tweets to differentiate e.g. Orange vs orange (brand vs foodstuff), using NLTK and scikit-learn☆58Updated 12 years ago
- A bundle of useful Elasticsearch plugins☆112Updated last year
- UberSocialNet—applying the Lambda Architecture☆30Updated 12 years ago
- Docker container to provide Apache Tika RESTful API☆41Updated 9 years ago
- Custom graph algorithms for Neo4j with own Java and REST APIs☆35Updated 9 years ago
- A Relaxed Schema Graph Database Management System☆52Updated 5 years ago
- A command line utility for generating thumbnails, resizing images, and uploading images to Amazon S3☆10Updated 10 years ago
- SKOS analysis for Elasticsearch☆54Updated 9 years ago