chrismattmann / etllib
This is the ETL lib package. It provides an API to munge and prepare JSON, TSV and other data using Apache Tika and JSON parsing/loading for ETL via Apache OODT (or other libs) into Apache Solr.
☆17Updated last year
Alternatives and similar repositories for etllib
Users who are interested in etllib are comparing it to the libraries listed below.
- All that entity matching, resolution, normalization, enhancement and reconciliation madness, but with a focus on data, not platforms.☆24Updated 3 years ago
- Uses Apache Lucene, OpenNLP and geonames and extracts locations from text and geocodes them.☆37Updated last year
- BatchRefine adds batch processing capabilities to OpenRefine☆50Updated 8 years ago
- LINKED DATA QUALITY REPORTS☆41Updated 3 years ago
- A high-throughput ontology-based pipeline for data integration☆14Updated 2 years ago
- Linked Data explorer and SPARQL endpoint☆23Updated 3 years ago
- A framework to allow the matching of string entities using customised sets of transformations and matchers, plus a tool to produce the ne…☆34Updated 8 years ago
- An HTTP proxy for Elasticsearch, Solr (etc.) to prevent a 100% full disk situation.☆11Updated 6 years ago
- Demonstration of searching PDF document with Solr, Tika, and Tesseract☆31Updated 8 months ago
- Files for the Karma tutorial at TCDL, Texas Conference on Digital Libraries☆29Updated 9 years ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri☆47Updated 3 years ago
- sparql-stream sensor queries☆16Updated 8 years ago
- Self-Service Semantic Suite (S4)☆17Updated 8 years ago
- Combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text.☆34Updated 2 years ago
- The DBpedia DataID vocabulary is a metadata system for detailed descriptions of datasets and their physical instances, as well as their r…☆38Updated last year
- Tool to cleanse and semantify datasets from CKAN repositories. Based on OpenRefine.☆23Updated 9 years ago
- PLOS Subject Area Thesaurus☆40Updated 8 months ago
- Execute OpenRefine JSON scripts without OpenRefine (or Java)☆30Updated 2 years ago
- Shared curriculum, files, notes, presentations, assignments, etc. for the CLIR DLF Eresearch Network☆9Updated 7 years ago
- Google Refine extension for adding columns (extending data) from DBpedia☆39Updated 11 years ago
- The inpho model and data processing tools. Interface between codex and inphosite☆18Updated 5 years ago
- Automatically exported from code.google.com/p/tdwg-rdf☆21Updated 5 years ago
- Java, Perl, Python, Javascript, Ruby, etc. examples to query alphasparql.bioontology.org☆34Updated 9 years ago
- Named-Entity Recognition extension for Google Refine / OpenRefine☆72Updated 8 years ago
- SKOS Support for Apache Lucene and Solr☆56Updated 4 years ago
- Loading OpenSanctions into Neo4J and Linkurious☆30Updated 6 months ago
- LOD-enabled version of OpenRefine. (This project is not actively maintained anymore)☆61Updated 5 years ago
- Efficient indexing and retrieval of OCR bounding boxes in Solr☆22Updated 6 years ago
- Mirror of Apache Stanbol (incubating)☆112Updated last year
- Advanced desktop search/corpus exploration prototype☆21Updated 4 years ago