Repository for Apache Spark course at Team Data Science
☆17 Oct 23, 2020 Updated 5 years ago
Alternatives and similar repositories for learning-apache-spark
Users who are interested in learning-apache-spark are comparing it to the repositories listed below
- Repository for the Document streaming capstone projects ☆12 Nov 17, 2025 Updated 4 months ago
- ☆15 Jul 1, 2021 Updated 4 years ago
- All important Python tools a Data Engineer needs ☆28 Jun 4, 2024 Updated last year
- Build a Content-Based Movie Recommender System (TF-IDF, BM25, BERT) ☆13 Jun 13, 2022 Updated 3 years ago
- Sample Project to Learn Data Engineering ☆10 Aug 1, 2021 Updated 4 years ago
- Web App Development Made Simple with Streamlit, published by Packt ☆32 Feb 1, 2024 Updated 2 years ago
- Dockerizing and Consuming an Apache Livy environment ☆13 Jun 29, 2022 Updated 3 years ago
- Dockerizing a Python Script for Web Scraping and consuming the scraped data using FastAPI (www.metroscubicos.com) ☆15 Dec 16, 2021 Updated 4 years ago
- ☆17 Nov 12, 2022 Updated 3 years ago
- Data sets and ML models versioning example from DVC get started ☆10 Jun 4, 2024 Updated last year
- Code Repository for GCP: Complete Google Data Engineer and Cloud Architect Guide(v), Published by Packt ☆16 Jan 30, 2023 Updated 3 years ago
- Challenge Data Engineer ☆25 Jun 13, 2022 Updated 3 years ago
- My notes on natural history, science, and technology. ☆17 Dec 21, 2025 Updated 3 months ago
- Labs and demos for courses in the Data Engineer track of GCP Training (http://cloud.google.com/training). ☆16 Oct 28, 2019 Updated 6 years ago
- Optimal probabilistic planning of the transmission network development with the consideration of wind resource uncertainty ☆11 Jun 1, 2019 Updated 6 years ago
- A PySpark job to handle upserts, conversion to Parquet and create partitions on S3 ☆28 Jul 23, 2020 Updated 5 years ago
- This is a simple Python library for interacting with the REST interface of an instance of Cordra ☆10 May 20, 2022 Updated 3 years ago
- Simulator for cellular automata defined on regular lattices on Minkowski plane ☆11 Aug 10, 2021 Updated 4 years ago
- ☆17 Mar 12, 2023 Updated 3 years ago
- Using Geopandas to Plot Brazil Maps ☆17 Oct 18, 2018 Updated 7 years ago
- Starter Code for BNR React Testing Workshop ☆12 Apr 18, 2023 Updated 2 years ago
- A self-contained, queryable knowledge graph of tech skills and IT stuff; maintained with git ☆18 Nov 14, 2023 Updated 2 years ago
- ☆12 Apr 21, 2021 Updated 4 years ago
- Curated list of industry data science blogs ☆13 Dec 21, 2016 Updated 9 years ago
- This repository contains the data and the code associated to the paper "Hyper-cores promote localization and efficient seeding in higher-… ☆12 Oct 6, 2023 Updated 2 years ago
- Implementing RAG with Amazon Bedrock, Amazon Titan, and Amazon OpenSearch Serverless ☆11 Oct 9, 2023 Updated 2 years ago
- 🔌 Flask S3Viewer is a powerful extension that makes it easy to browse S3 in any Flask application. (Python S3 Uploader / Flask S3 Upload … ☆14 Jan 8, 2025 Updated last year
- Visually query Spanner Graph data in notebooks ☆40 Sep 18, 2025 Updated 6 months ago
- Instruction tuning dataset generation inspired by LLaVA-Instruct-158k via any LLM, also for commercial use. ☆13 Mar 13, 2024 Updated 2 years ago
- Annual and in-season crop mapping in Kenya ☆25 Jun 8, 2021 Updated 4 years ago
- Tutorial for building a POC Kafka + Spark + Cassandra pipeline using Scala ☆32 Apr 13, 2020 Updated 5 years ago
- Automating Your Data Pipeline with Apache Airflow ☆40 Sep 1, 2023 Updated 2 years ago
- Simple Android App showing movement (delta frames) in images using Camera2 API using RenderScript. ☆24 Nov 13, 2019 Updated 6 years ago
- This is a repository for the LinkedIn Learning course Practical Python for Data Professionals ☆49 Jun 12, 2024 Updated last year
- Engineer streaming processing data pipeline on Azure with the main purpose to ingest and process tweets and satellite images data from Hu… ☆23 Apr 8, 2021 Updated 4 years ago
- This is an NBD server for OpenStack Object Storage (Swift) ☆31 Mar 31, 2016 Updated 9 years ago
- Processing source code for an animation ☆10 Jan 28, 2022 Updated 4 years ago
- 🌎 🖥 Supercharge your scraper to extract quality page metadata by parsing JSON-LD data via Python's extruct library. ☆15 Mar 5, 2026 Updated 2 weeks ago
- TDWG website ☆16 Updated this week