pawelrychlik / duplitector
A duplicate data detector engine PoC based on Elasticsearch.
☆20 · Updated 10 years ago
Alternatives and similar repositories for duplitector
Users that are interested in duplitector are comparing it to the libraries listed below
- A command line and Python client for Open-Spending ☆10 · Updated 7 years ago
- Google Drive river for Elasticsearch ☆20 · Updated 10 years ago
- common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text ☆35 · Updated 8 years ago
- ScaleGraph is an X10 billion scale graph analysis library. ☆21 · Updated 9 years ago
- Baseform lemmatization for Elasticsearch ☆26 · Updated 6 years ago
- Python interface for OrientDB binary Serialization ☆10 · Updated 5 years ago
- Term List Matching Plugin for ElasticSearch ☆26 · Updated 11 years ago
- A bundle of useful Elasticsearch plugins ☆111 · Updated last year
- Hunspell analysis for ElasticSearch ☆38 · Updated 13 years ago
- Crime Doesn't Climb in San Francisco ☆100 · Updated 11 years ago
- Focused Crawler for VT's CTRNet ☆10 · Updated 12 years ago
- Analysis plugin for ElasticSearch providing capability for processing inline annotations in documents. ☆35 · Updated 11 years ago
- Simple Hungarian Sentence Analysis with NLTK ☆16 · Updated 4 years ago
- A stemmer for Slovak language ☆12 · Updated 8 years ago
- Full text extraction using the Open Source Tesseract OCR software https://code.google.com/p/tesseract-ocr/ and imagemagick ☆12 · Updated 10 years ago
- Python natural language processing work ☆29 · Updated 15 years ago
- A Python implementation of causal inference of pathways using Gibbs sample approach ☆10 · Updated 11 years ago
- Analyze standard numbers like ARK, DOI, EAN, GTIN, IBAN, ISAN, ISBN, ISMN, ISNI, ISSN, ISTC, ISWC, ORCID, PPN, SICI, UPC, ZDB with Elasti… ☆24 · Updated 9 years ago
- Rich browser-based frontend for elasticsearch ☆102 · Updated 10 years ago
- Like Facebook's OSQuery, but for Postgres ☆449 · Updated 9 years ago
- memcached transport plugin for elasticsearch (STOPPED) ☆34 · Updated 2 years ago
- Discover, analyze and present data from the web and mobile in meaningful ways ☆82 · Updated 12 years ago
- An elasticsearch site plugin for identifying risky IPs or subnets in web logs ☆46 · Updated 9 years ago
- a pure javascript frontend for ElasticSearch search indices. ☆80 · Updated 7 years ago
- Docker container to provide Apache Tika RESTful API ☆41 · Updated 9 years ago
- Import GeoNames.org data into a SQLite database for full-text search and autocomplete ☆35 · Updated 6 years ago
- FacetView is a pure javascript frontend for ElasticSearch. ☆291 · Updated 10 years ago
- Shave pages off of PDFs as images ☆59 · Updated 7 years ago
- Web frontend for Myria ☆11 · Updated 4 years ago
- A set of components designed to retrieve data from third-party APIs and storage systems, and to pass that data in to a DataSift account. ☆9 · Updated 7 years ago