chrismattmann / etllib
This is the ETL lib package. It provides an API to munge and prepare JSON, TSV and other data using Apache Tika and JSON parsing/loading for ETL via Apache OODT (or other libs) into Apache Solr.
☆16 · Updated 9 months ago
Related projects
Alternatives and complementary repositories for etllib
- Uses Apache Lucene, OpenNLP, and GeoNames to extract locations from text and geocode them. ☆36 · Updated 7 months ago
- An HTTP proxy for Elasticsearch, Solr (etc.) to prevent a 100% full disk situation. ☆11 · Updated 6 years ago
- Advanced desktop search/corpus exploration prototype ☆21 · Updated 3 years ago
- Efficient indexing and retrieval of OCR bounding boxes in Solr ☆22 · Updated 5 years ago
- Vizlinc ☆14 · Updated 8 years ago
- The OpenSextant Gazetteer is a collection of world-wide place name data ☆12 · Updated 6 years ago
- The DPLA Platform ☆64 · Updated 6 years ago
- All that entity matching, resolution, normalization, enhancement and reconciliation madness, but with a focus on data, not platforms. ☆24 · Updated 2 years ago
- An experiment in visualizing your Solr index via term counts, document counts, and memory usage per field and data type. ☆15 · Updated 9 years ago
- SKOS Support for Apache Lucene and Solr ☆56 · Updated 3 years ago
- BatchRefine adds batch processing capabilities to OpenRefine ☆50 · Updated 7 years ago
- LINKED DATA QUALITY REPORTS ☆41 · Updated 2 years ago
- Files for the Karma tutorial at TCDL, Texas Conference on Digital Libraries ☆28 · Updated 8 years ago
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess… ☆14 · Updated 9 years ago
- Combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text. ☆32 · Updated last year
- For interacting with nutch via Python ☆23 · Updated 3 weeks ago
- Linked Data explorer and SPARQL endpoint ☆23 · Updated 2 years ago
- Simple search results with Solr and EmberJS ☆58 · Updated 5 years ago
- Search a single field with different query time analyzers in Solr ☆25 · Updated 4 years ago
- Big GeoSpatial Data Points Visualization Tool ☆19 · Updated 8 years ago
- Mirror of Apache Stanbol (incubating) ☆112 · Updated 8 months ago
- UnifiedViews ☆30 · Updated 2 years ago
- The DBpedia DataID vocabulary is a metadata system for detailed descriptions of datasets and their physical instances, as well as their r… ☆35 · Updated last year
- 💠 + 📚 OpenRefine on Binder! ☆40 · Updated 5 months ago
- sparql-stream sensor queries ☆16 · Updated 8 years ago
- A toolkit for clustering web pages based on various similarity measures. ☆32 · Updated 3 years ago
- A design prototype for DocNow to learn with ☆14 · Updated 7 years ago
- A framework to allow the matching of string entities using customised sets of transformations and matchers, plus a tool to produce the ne… ☆30 · Updated 7 years ago
- Shared curriculum, files, notes, presentations, assignments, etc. for the CLIR DLF Eresearch Network ☆9 · Updated 6 years ago
- Solrstrap is a Query-Result interface for Solr written in JavaScript, HTML and CSS ☆86 · Updated 7 years ago