etllib is an ETL library package. It provides an API to munge and prepare JSON, TSV, and other data, using Apache Tika and JSON parsing/loading, for ETL via Apache OODT (or other libraries) into Apache Solr.
☆18 · Jan 27, 2024 · Updated 2 years ago
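As a rough illustration of the TSV-to-JSON preparation step described above, here is a minimal sketch that uses only the Python standard library. It does not use or reproduce etllib's actual API; the function names `tsv_to_docs` and `docs_to_json`, the `id_field` parameter, and the fallback of using the row number as an identifier are all assumptions made for this example. The output is a JSON array of flat documents, which is a shape Solr's JSON update handler can accept.

```python
import csv
import io
import json


def tsv_to_docs(tsv_text, id_field="id"):
    """Parse TSV text (first line is the header) into a list of dicts.

    Hypothetical helper, not part of etllib. Empty values are dropped,
    and rows without an identifier get their 1-based row number as one.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    docs = []
    for row_number, row in enumerate(reader, start=1):
        # Skip empty cells, missing cells (None), and overflow columns (key None).
        doc = {key: value for key, value in row.items() if key and value}
        # Assumption: fall back to the row number when no id is present.
        doc.setdefault(id_field, str(row_number))
        docs.append(doc)
    return docs


def docs_to_json(docs):
    """Serialize the documents as a JSON array with stable key order."""
    return json.dumps(docs, sort_keys=True)
```

Calling `docs_to_json(tsv_to_docs(text))` on the contents of a TSV file yields a string that could then be sent to a Solr update endpoint by whatever HTTP client or loader the pipeline already uses; that step is left out here because it depends on the deployment.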
Alternatives and similar repositories for etllib
Users who are interested in etllib compare it to the libraries listed below.
- A DropWizard wrapper around Apache Tika. ☆10 · Dec 22, 2016 · Updated 9 years ago
- Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features. ☆108 · Apr 9, 2025 · Updated 10 months ago
- Uses Apache Lucene, OpenNLP and geonames and extracts locations from text and geocodes them. ☆38 · Apr 9, 2024 · Updated last year
- This is a Fact based Question Answering System using Apache Solr as backend search engine, Wikipedia dumps as information source, Apache … ☆26 · Jan 21, 2026 · Updated last month
- Multi-step AI agents powered by Gemini 2.0 and the LangGraph framework. These agents orchestrate complex workflows and enhance their reas… ☆10 · Dec 19, 2024 · Updated last year
- Combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text. ☆34 · May 3, 2023 · Updated 2 years ago
- A toolkit for clustering web pages based on various similarity measures. ☆34 · Oct 27, 2021 · Updated 4 years ago
- Overcooked! 2 TAS Development Framework ☆10 · Aug 18, 2023 · Updated 2 years ago