PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
☆89Jan 3, 2020Updated 6 years ago
Alternatives and similar repositories for pyspark-algorithms
Users that are interested in pyspark-algorithms are comparing it to the libraries listed below.
- O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian☆231Jun 26, 2023Updated 2 years ago
- Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University☆165Dec 4, 2025Updated 5 months ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection☆17Jan 12, 2017Updated 9 years ago
- Code examples on Apache Spark using python☆108Aug 11, 2022Updated 3 years ago
- library for conducting propensity matching on spark scale☆14Jun 27, 2023Updated 2 years ago
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way☆18Nov 4, 2025Updated 6 months ago
- ☆18Nov 9, 2025Updated 6 months ago
- Updated repository☆156Nov 25, 2021Updated 4 years ago
- Collection of Databricks and Jupyter Notebooks☆22Feb 9, 2026Updated 3 months ago
- The 6 most window functions in PySpark - based on my blog post☆12Dec 15, 2023Updated 2 years ago
- My finite volume method project. Here I will implement the many pieces of a finite volume method to incorporate into a larger code.☆11Aug 15, 2019Updated 6 years ago
- Lightweight Streamlit app to test out metrics functionality in dbt☆10Feb 22, 2022Updated 4 years ago
- Following "Pure React" book☆11Dec 1, 2017Updated 8 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code☆14Nov 29, 2021Updated 4 years ago
- Python programming practices on InterviewBit☆22Sep 29, 2020Updated 5 years ago
- ☆10Nov 7, 2022Updated 3 years ago
- Shopping mall and forum built with the SSM framework☆15Jun 30, 2018Updated 7 years ago
- ☆14Sep 14, 2021Updated 4 years ago
- Azure Cosmos DB - Custom Point in Time Restore☆12Dec 7, 2022Updated 3 years ago
- Code base for the Learning PySpark book (in preparation)☆631Apr 16, 2019Updated 7 years ago
- Counting Tweets Per User in Real-Time☆43Jul 28, 2017Updated 8 years ago
- Distributed stock price forecasting system to predict S&P 500 stock prices.☆11Nov 12, 2021Updated 4 years ago
- personal repo for https://github.com/EbookFoundation/free-programming-books☆18Oct 28, 2021Updated 4 years ago
- ecommerce GCP Streaming pipeline ― Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery and Tableau; GCP Batch pipelin…☆11Mar 9, 2022Updated 4 years ago
- ☆10Feb 18, 2021Updated 5 years ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark☆1,536Dec 2, 2024Updated last year
- Source code for Spark MLlib machine learning in practice☆10Oct 28, 2016Updated 9 years ago
- An (unofficial) command line interface for Google APIs☆31May 22, 2023Updated 3 years ago
- Data transformation☆23Apr 18, 2021Updated 5 years ago
- Library to write readable and reproducible data processing code using python.☆23May 26, 2023Updated 3 years ago
- ☆10Jun 5, 2021Updated 4 years ago
- SuperTango8, Grapher by Algoritmegruppen☆14Jul 15, 2017Updated 8 years ago
- Clojure lens library, implements a small subset of ekmett's lens☆12Nov 29, 2018Updated 7 years ago
- ☆12Feb 20, 2020Updated 6 years ago
- Framework to make bots based on Microsoft Bot Framework.☆13Oct 5, 2018Updated 7 years ago
- A fast and low memory requirement version of PointHop and PointHop++, which is built upon Apache Spark.☆10Jul 14, 2020Updated 5 years ago
- Highly interactive, thread-parallel Lattice Boltzmann CFD solver☆21Apr 29, 2019Updated 7 years ago
- Converts OData query strings to DocumentDB SQL statements☆13Dec 23, 2024Updated last year
- A project for the development of rich geospatial data from the city of São Paulo for use in Machine Learning models.☆11Jul 4, 2021Updated 4 years ago