chrismattmann / etllib
This is the ETL lib package. It provides an API to munge and prepare JSON, TSV and other data using Apache Tika and JSON parsing/loading for ETL via Apache OODT (or other libs) into Apache Solr.
☆17 · Updated last year
Alternatives and similar repositories for etllib
Users who are interested in etllib are comparing it to the libraries listed below.
- All that entity matching, resolution, normalization, enhancement and reconciliation madness, but with a focus on data, not platforms. ☆24 · Updated 3 years ago
- An HTTP proxy for Elasticsearch, Solr (etc.) to prevent a 100% full disk situation. ☆11 · Updated 6 years ago
- Uses Apache Lucene, OpenNLP and geonames and extracts locations from text and geocodes them. ☆37 · Updated last year
- BatchRefine adds batch processing capabilities to OpenRefine ☆50 · Updated 8 years ago
- Linked Data explorer and SPARQL endpoint ☆23 · Updated 3 years ago
- The OpenSextant Gazetteer is a collection of world-wide place name data ☆12 · Updated 7 years ago
- sparql-stream sensor queries ☆16 · Updated 8 years ago
- A framework to allow the matching of string entities using customised sets of transformations and matchers, plus a tool to produce the ne… ☆33 · Updated 8 years ago
- LINKED DATA QUALITY REPORTS ☆41 · Updated 3 years ago
- Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters. ☆44 · Updated 3 weeks ago
- Google Refine extension for adding columns (extending data) from DBpedia ☆39 · Updated 11 years ago
- For interacting with nutch via Python ☆29 · Updated last month
- Execute OpenRefine JSON scripts without OpenRefine (or Java) ☆30 · Updated 2 years ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri ☆47 · Updated 3 years ago
- The DPLA Platform ☆64 · Updated 6 years ago
- Vizlinc ☆15 · Updated 9 years ago
- Shared curriculum, files, notes, presentations, assignments, etc. for the CLIR DLF Eresearch Network ☆9 · Updated 6 years ago
- 💠 + 📚 OpenRefine on Binder! ☆41 · Updated 11 months ago
- Utilities for working with streaming XML pipelines ☆13 · Updated 9 years ago
- Automatically exported from code.google.com/p/tdwg-rdf ☆21 · Updated 5 years ago
- Efficient indexing and retrieval of OCR bounding boxes in Solr ☆22 · Updated 6 years ago
- Advanced desktop search/corpus exploration prototype ☆21 · Updated 3 years ago
- Combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text. ☆34 · Updated 2 years ago
- SKOS Support for Apache Lucene and Solr ☆56 · Updated 4 years ago
- Files for the Karma tutorial at TCDL, Texas Conference on Digital Libraries ☆29 · Updated 9 years ago
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess… ☆16 · Updated 9 years ago
- See https://github.com/tworavens/tworavens for current repository for this project and http://2ra.vn for project pages. ☆30 · Updated 6 years ago
- This is a basic instance of the D-Net software toolkit, a software framework for the realization of aggregative data infrastructures. ☆15 · Updated 3 years ago
- KnowledgeStore ☆20 · Updated 7 years ago
- Named-Entity Recognition extension for Google Refine / OpenRefine ☆72 · Updated 7 years ago