datamindedbe / python-and-spark-for-data-analysis
A four-day course on Python, the Scientific Python stack and PySpark, adapted from a training course given by Patrick Varilly to one of our clients in December 2015
☆11 Updated 9 years ago
Alternatives and similar repositories for python-and-spark-for-data-analysis:
Users who are interested in python-and-spark-for-data-analysis are comparing it to the repositories listed below
- A four-day course on Python, the Scientific Python stack and PySpark, adapted from a training course I gave to one of our clients in Dece… ☆10 Updated 9 years ago
- Ingest tweets with Kafka. Use Spark to track popular hashtags and trendsetters for each hashtag ☆29 Updated 9 years ago
- Partly lecture and partly a hands-on tutorial and workshop, this is a three part series on how to get started with MLflow. In this four p… ☆39 Updated 4 years ago
- AWS Big Data Certification ☆25 Updated 3 months ago
- Spark NLP for Streamlit ☆15 Updated 3 years ago
- Code for my presentation: Using PySpark to Process Boat Loads of Data ☆20 Updated 7 years ago
- Build end-to-end Machine Learning pipeline to predict accessibility of playgrounds in NYC ☆15 Updated 4 years ago
- Contains source files used in the Spark with Python course ☆18 Updated 6 years ago
- Creating a Streaming Pipeline for user log data in Google Cloud Platform ☆22 Updated 5 years ago
- ☆17 Updated 6 years ago
- ☆16 Updated last year
- ☆37 Updated 8 years ago
- Live Twitter sentiment analysis using Python, Apache Spark Streaming, Kafka, NLTK, SocketIO ☆20 Updated 7 years ago
- A simple introduction to using spark ml pipelines ☆26 Updated 7 years ago
- Instant search for and access to many datasets in Pyspark. ☆34 Updated 2 years ago
- A project template for developing BYOD docker images for use in Amazon SageMaker. ☆19 Updated 5 years ago
- Code examples for the Introduction to Kubeflow course ☆14 Updated 4 years ago
- Repository used for Spark Trainings ☆53 Updated 2 years ago
- ☆23 Updated 5 years ago
- End to end Machine Learning with Amazon SageMaker ☆42 Updated last year
- Projects developed by Domino's R&D team ☆76 Updated 3 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 Updated 3 months ago
- Build data models, data warehouses, and data lakes; automate data pipelines; and work with massive datasets. ☆13 Updated 5 years ago
- notebooks for nlp-on-spark ☆13 Updated 8 years ago
- Code demonstrating a simple Machine Learning model abstract base class and its uses. ☆14 Updated last year
- CentOS based Docker container for Time Series Analysis and Modeling. ☆21 Updated 5 years ago
- A code-based tutorial for production level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, & Apache Drill u… ☆26 Updated 5 years ago
- Follow the Lumiata Tech Blog on Medium! ☆21 Updated last year
- Example of orchestrating dependent Databricks jobs using Airflow ☆11 Updated 5 years ago
- Data Science Boot-Camp : UC San DiegoX ☆17 Updated last year