enahwe / Csv2Hive
Csv2Hive is a useful CSV schema finder for Big Data. It automatically discovers schemas in big CSV files, generates the 'CREATE TABLE' statements and creates Hive tables. You don't need to write any schemas at all. Csv2Hive is a really fast solution for integrating whole CSV files into your DataLake.
☆27Updated 8 years ago
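The idea described above (infer column types from a CSV sample, then emit a 'CREATE TABLE' statement) can be illustrated with a minimal sketch. This is not Csv2Hive's actual implementation: the function names `infer_type` and `create_table_statement`, the three-type lattice (BIGINT, DOUBLE, STRING), and the sampling limit are all assumptions made for illustration.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest of BIGINT, DOUBLE, STRING that fits all non-empty values."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        # No evidence either way: fall back to the most permissive type.
        return "STRING"
    for hive_type, cast in (("BIGINT", int), ("DOUBLE", float)):
        try:
            for v in non_empty:
                cast(v)
            return hive_type
        except ValueError:
            continue
    return "STRING"


def create_table_statement(csv_text, table_name, delimiter=",", sample_rows=1000):
    """Build a Hive CREATE TABLE statement from the header and a sample of rows."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader)
    # Read at most `sample_rows` data rows (hypothetical limit for this sketch).
    rows = [row for _, row in zip(range(sample_rows), reader)]
    columns = []
    for i, name in enumerate(header):
        values = [row[i] for row in rows if i < len(row)]
        columns.append(f"  `{name.strip()}` {infer_type(values)}")
    return (
        f"CREATE TABLE `{table_name}` (\n"
        + ",\n".join(columns)
        + "\n)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '"
        + delimiter
        + "'\nSTORED AS TEXTFILE\n"
        + "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


if __name__ == "__main__":
    sample = "id,price,name\n1,2.5,apple\n2,3,pear\n"
    print(create_table_statement(sample, "fruits"))
```

A real tool would also need to handle dates, quoted delimiters, and column names that are not valid Hive identifiers; the sketch only shows the basic inference-then-generate flow.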
Alternatives and similar repositories for Csv2Hive
Users that are interested in Csv2Hive are comparing it to the libraries listed below
- PySpark for Elastic Search☆55Updated 8 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin☆52Updated 9 years ago
- An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parse…☆90Updated 9 years ago
- Coding exercises for Apache Spark☆104Updated 10 years ago
- Visualize streaming machine learning in Spark☆177Updated 8 years ago
- Code reference from my Qbox blog posts.☆87Updated 10 years ago
- ☆146Updated 9 years ago
- Real-time Machine Learning with Apache Spark on Twitter Public Stream☆68Updated 8 years ago
- Oracle Data Science Bootcamp 2014☆25Updated 10 years ago
- Training materials for Strata, AMP Camp, etc☆149Updated 9 years ago
- Gallery of Apache Zeppelin notebooks☆216Updated 6 years ago
- This project combines Apache Spark and Elasticsearch to enable mining & prediction for Elasticsearch.☆211Updated 10 years ago
- Spark 2.0 Python Machine Learning examples☆97Updated 6 years ago
- Elastic Search on Spark☆112Updated 11 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python)☆74Updated last year
- Spark Extension : ML transformers, SQL aggregations, etc that are missing in Apache Spark☆146Updated 9 years ago
- Beyond Piwik Analytics with Scala and Apache Spark☆46Updated 10 years ago
- A Spark Streaming job reading events from Amazon Kinesis and writing event counts to DynamoDB☆93Updated 5 years ago
- A simple tool for plotting Spark ML's Decision Trees☆40Updated 3 years ago
- Example project showing how to use Hive UDFs in Apache Spark☆55Updated 6 years ago
- PredictionIO Python SDK☆196Updated 7 years ago
- A platform for real-time streaming search☆102Updated 9 years ago
- An Apache Spark-shell backend for IPython☆105Updated 4 years ago
- Docker container for Kafka - Spark Streaming - Cassandra☆98Updated 6 years ago
- Tutorials and samples that show you how to get the most out of IBM Analytics for Apache Spark☆79Updated 7 years ago
- Hadoop, Spark and Storm based anomaly detection implementations for data quality, cyber security, fraud detection etc.☆127Updated last year
- ☆110Updated 8 years ago
- CustomerML is an open source customer science platform leveraging the power of Predictiveworks and fully integrated with Elasticsearch an…☆48Updated 10 years ago
- Spark GCE Script Helps you deploy Spark cluster on Google Cloud.☆43Updated 10 years ago
- A collection of examples using Flink's new Python API☆250Updated 6 months ago