tdunning / feature-extraction
Sample techniques for a variety of feature extraction methods
☆31 · Updated 4 years ago
Alternatives and similar repositories for feature-extraction
Users who are interested in feature-extraction are comparing it to the repositories listed below.
- ☆16 · Updated 7 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S… ☆67 · Updated 9 years ago
- Workshop on Target Leakage in Machine Learning I taught at ODSC Europe 2018 (London) and ODSC East 2019, 2020 (Boston) ☆37 · Updated 5 years ago
- A Scalable Data Cleaning Library for PySpark. ☆27 · Updated 6 years ago
- Workshop for Spark and Databricks ☆54 · Updated 5 years ago
- pyspark sample scripts ☆17 · Updated 6 years ago
- notebooks for nlp-on-spark ☆13 · Updated 8 years ago
- Some work on Kaggle data for fun ☆64 · Updated 7 years ago
- ☆16 · Updated 2 years ago
- AWS Big Data Certification ☆25 · Updated 4 months ago
- Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References ☆69 · Updated 6 years ago
- Code for my presentation: Using PySpark to Process Boat Loads of Data ☆20 · Updated 7 years ago
- ☆25 · Updated 6 years ago
- Large-scale Graph Mining with Spark ☆40 · Updated 6 years ago
- Forecasting Uber demand in NYC neighborhoods ☆34 · Updated 7 years ago
- Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames. ☆103 · Updated 5 years ago
- ☆48 · Updated last year
- Predict whether a student will correctly answer a problem based on past performance using automated feature engineering ☆32 · Updated 4 years ago
- Binding the GDELT universe in a Spark environment ☆24 · Updated 2 years ago
- Guide for applying Unit Testing in data-driven projects ☆19 · Updated 5 years ago
- ☆11 · Updated 6 years ago
- Using Luigi to create a Machine Learning Pipeline using the Rossman Sales data from Kaggle ☆33 · Updated 8 years ago
- Data Scientist code test ☆19 · Updated 4 years ago
- Materials for Apache Arrow workshop at VLDB 2019 ☆42 · Updated 4 years ago
- Business Data Analysis by HiPIC of CalStateLA ☆20 · Updated 6 years ago
- Mastering Spark for Data Science, published by Packt ☆47 · Updated 2 years ago
- In-class exercises for Deep Learning course at NYC Data Science Academy ☆32 · Updated 7 years ago
- introduction class to recommendation systems ☆22 · Updated 5 years ago
- MLinProduction SageMaker workshop hosted in April 2020 ☆15 · Updated 5 years ago
- ☆155 · Updated 4 years ago