seahboonsiew / pyspark-csv

An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. It parses CSV data into a SchemaRDD. No installation is required: simply include pyspark_csv.py via SparkContext.
90 · Updated 9 years ago
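
To make "automatic type inference and null value handling" concrete, here is a minimal pure-Python sketch of the general idea. It is not pyspark-csv's implementation and does not use Spark; the names `NULL_TOKENS`, `infer_column_type`, and `read_csv_text` are hypothetical, and the set of strings treated as null is an assumption chosen for illustration.

```python
import csv
import io

# Assumed set of strings treated as missing values (illustrative only).
NULL_TOKENS = {"", "na", "n/a", "null", "none", "nan"}


def is_null(cell):
    """Return True if the raw cell text should be read as a null value."""
    return cell.strip().lower() in NULL_TOKENS


def infer_column_type(values):
    """Pick the narrowest of int, float, str that fits every non-null cell."""
    non_null = [v.strip() for v in values if not is_null(v)]
    if not non_null:
        return str  # nothing to infer from, fall back to string
    for candidate in (int, float):
        try:
            for v in non_null:
                candidate(v)
            return candidate
        except ValueError:
            continue
    return str


def read_csv_text(text):
    """Parse CSV text with a header row into (header, types, typed rows)."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = list(zip(*data))  # transpose rows into columns
    types = [infer_column_type(col) for col in columns]
    records = [
        [None if is_null(cell) else t(cell.strip()) for cell, t in zip(row, types)]
        for row in data
    ]
    return header, types, records


header, types, records = read_csv_text("name,age,score\nalice,30,1.5\nbob,NA,2\n")
```

In this sketch each column gets a single type, so a column that mixes integers and decimals is read as float, and missing cells become `None` rather than forcing the whole column to string.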

Alternatives and similar repositories for pyspark-csv:

Users who are interested in pyspark-csv are comparing it to the libraries listed below.