enahwe / Csv2Hive
Csv2Hive is a useful CSV schema finder for Big Data. It automatically discovers schemas in large CSV files, generates the 'CREATE TABLE' statements, and creates Hive tables. You don't need to write any schemas at all. Csv2Hive is a really fast solution for integrating whole CSV files into your data lake.
☆27 · Updated 7 years ago
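To illustrate the general idea of schema discovery described above, here is a minimal Python sketch. It is not Csv2Hive's actual implementation: the function names `infer_hive_type` and `build_create_table` are hypothetical, the type inference only distinguishes BIGINT, DOUBLE, and STRING, and it assumes the first CSV row is a header and that the whole sample fits in memory.

```python
import csv
import io


def infer_hive_type(values):
    """Pick the narrowest of BIGINT, DOUBLE, STRING that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        # No evidence either way, so fall back to the most permissive type.
        return "STRING"
    for hive_type, caster in (("BIGINT", int), ("DOUBLE", float)):
        try:
            for v in non_empty:
                caster(v)
            return hive_type
        except ValueError:
            continue
    return "STRING"


def build_create_table(csv_text, table_name, delimiter=","):
    """Build a Hive CREATE TABLE statement from CSV text whose first row is a header."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    columns = []
    for i, name in enumerate(header):
        # Short rows are tolerated: missing cells are simply ignored.
        column_values = [row[i] for row in data if i < len(row)]
        columns.append(f"  `{name}` {infer_hive_type(column_values)}")
    return (
        f"CREATE TABLE `{table_name}` (\n"
        + ",\n".join(columns)
        + "\n)\n"
        + "ROW FORMAT DELIMITED\n"
        + f"FIELDS TERMINATED BY '{delimiter}'\n"
        + "STORED AS TEXTFILE\n"
        # Tell Hive to skip the header row when reading the file.
        + 'TBLPROPERTIES ("skip.header.line.count"="1")'
    )


if __name__ == "__main__":
    sample = "id,price,name\n1,9.5,apple\n2,3,pear\n"
    print(build_create_table(sample, "fruits"))
```

A real tool would likely sample only part of a large file and handle more types (dates, booleans, decimals); this sketch only shows how inferred column types can be turned into a 'CREATE TABLE' statement.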
Related projects
Alternatives and complementary repositories for Csv2Hive
- PySpark for Elastic Search ☆55 · Updated 7 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin ☆52 · Updated 8 years ago
- Example project showing how to use Hive UDFs in Apache Spark ☆55 · Updated 5 years ago
- Additional useful algorithms that can be used with Spark. ☆24 · Updated 9 years ago
- Training materials for Strata, AMP Camp, etc. ☆150 · Updated 8 years ago
- A light Kafka to HDFS/S3 ETL library based on Apache Spark ☆41 · Updated 7 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 · Updated last year
- Beyond Piwik Analytics with Scala and Apache Spark ☆45 · Updated 9 years ago
- A real-time streaming implementation of Markov chain based fraud detection ☆24 · Updated 9 years ago
- This project is for examples of how to use Zeppelin. https://github.com/apache/incubator-zeppelin ☆25 · Updated 8 years ago
- Zeppelin notebook examples ☆26 · Updated 8 years ago
- Factorization Machines on Spark and Glint ☆25 · Updated 8 years ago
- An Apache Spark-shell backend for IPython ☆105 · Updated 3 years ago
- An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parse… ☆90 · Updated 8 years ago
- This project provides association rule mining for Apache Spark. The algorithms are based on the work of Philippe Fournier-Viger and comp… ☆31 · Updated 9 years ago
- Example of use of Spark Streaming with Kafka ☆90 · Updated 10 years ago
- A Spark Streaming job reading events from Amazon Kinesis and writing event counts to DynamoDB ☆94 · Updated 4 years ago
- Complete Pipeline Training at Big Data Scala By the Bay ☆71 · Updated 9 years ago
- Experiments with Ooyala's Spark Job Server ☆21 · Updated 9 years ago
- An introduction to Apache Spark DataFrames. ☆41 · Updated 9 years ago
- An example of using Avro and Parquet in Spark SQL ☆60 · Updated 8 years ago
- ☆41 · Updated 7 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S… ☆66 · Updated 8 years ago
- Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR ☆34 · Updated 8 years ago
- This tutorial provides a quick introduction to using Spark ☆57 · Updated 8 years ago
- ☆146 · Updated 8 years ago
- Elastic Search on Spark ☆112 · Updated 10 years ago