geyungjen / jentekllc
Apache Spark Application Development -- George Jen, Jen Tek LLC
☆15 · Updated last year
Related projects
Alternatives and complementary repositories for jentekllc
- How to do data science with Optimus, Spark and Python. ☆18 · Updated 5 years ago
- Openscoring application for the Docker distributed applications platform ☆10 · Updated 4 years ago
- PySpark phonetic and string matching algorithms ☆35 · Updated 8 months ago
- Spark NLP for Streamlit ☆15 · Updated 3 years ago
- Splittable SAS (.sas7bdat) Input Format for Hadoop and Spark SQL ☆90 · Updated last year
- Productivity Utilities for Data Science with Python Notebooks ☆5 · Updated 4 years ago
- H2OAI Driverless AI Code Samples and Tutorials ☆37 · Updated 2 weeks ago
- Instant search for and access to many datasets in Pyspark. ☆34 · Updated 2 years ago
- Hands-On Data Analysis with Scala, published by Packt ☆19 · Updated last year
- Converts a Zeppelin notebook written in a single programming language to the corresponding script ☆18 · Updated 4 years ago
- Spark and Python (PySpark) Examples ☆39 · Updated 3 years ago
- Materials for Apache Arrow workshop at VLDB 2019 ☆42 · Updated 4 years ago
- An example PySpark project with pytest ☆17 · Updated 7 years ago
- Mastering Spark for Data Science, published by Packt ☆46 · Updated last year
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟 ☆53 · Updated 2 years ago
- [ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples ☆70 · Updated 4 years ago
- Analyzing NBA data using Spark 2.1 ☆46 · Updated 7 years ago
- Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on sing… ☆44 · Updated 2 weeks ago
- Study notes and demos. ☆12 · Updated 8 months ago
- Demonstration code for MLeap, both Jupyter notebooks and projects ☆24 · Updated 5 years ago
- A repository for a PySpark Cookbook by Tomasz Drabas and Denny Lee ☆60 · Updated 6 years ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 · Updated 6 years ago