seahboonsiew / pyspark-csv

An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parses CSV data into a SchemaRDD. No installation required; simply include pyspark_csv.py via SparkContext.
90 · Updated 9 years ago
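
To make "automatic type inference and null value handling" concrete, here is a minimal plain-Python sketch of the idea. It is not the module's actual code and does not use Spark: the helper names `infer_value` and `read_csv_text`, the set of null markers, and the per-cell (rather than per-column) inference are all assumptions made for illustration.

```python
import csv
import io

# Assumed null markers; the real module may recognise a different set.
NULL_TOKENS = {"", "NA", "NULL", "null", "None"}


def infer_value(text):
    """Convert one CSV cell to None, int, float, or str (tried in that order)."""
    text = text.strip()
    if text in NULL_TOKENS:
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass  # not parseable as this type; try the next one
    return text  # fall back to the raw string


def read_csv_text(data):
    """Parse CSV text with a header row into a list of dicts with typed values."""
    reader = csv.DictReader(io.StringIO(data))
    return [{key: infer_value(value) for key, value in row.items()} for row in reader]


rows = read_csv_text("name,age,score\nann,31,4.5\nbob,,NA\n")
for row in rows:
    print(row)
```

Inferring a type per cell keeps the sketch short; a schema-based reader would normally settle on one type per column before building its rows.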

Related projects

Alternatives and complementary repositories for pyspark-csv