Machine Learning and Data Analysis Case Studies using Spark.
☆72Mar 22, 2021Updated 5 years ago
Alternatives and similar repositories for Data-Science-with-Spark
Users that are interested in Data-Science-with-Spark are comparing it to the libraries listed below.
- Analyzing and calculating key marketing metrics with SQL and Python☆14Feb 24, 2019Updated 7 years ago
- Statistical Hypothesis Testing with the Pingouin Python Library.☆11Aug 25, 2022Updated 3 years ago
- Repository used for Spark Trainings☆54Apr 21, 2023Updated 3 years ago
- ☆17Mar 18, 2018Updated 8 years ago
- A command-line batch interface to the RuleFit statistical model building program.☆20Jan 30, 2017Updated 9 years ago
- Extract LinkedIn comments from any post and export them to an Excel file☆23Oct 17, 2018Updated 7 years ago
- List of links to datasets related to Colombia☆28Mar 17, 2016Updated 10 years ago
- Minimal example to setup a Jenkins-CI pipeline for data science projects on OpenShift in a couple of minutes.☆27Jan 7, 2025Updated last year
- Personalization with deep learning in 100 lines of code☆15Mar 31, 2023Updated 3 years ago
- Repo for the Deep Reinforcement Learning Nanodegree program☆12Jun 12, 2023Updated 3 years ago
- Different machine learning algorithms implementation in Tensorflow☆27Dec 8, 2016Updated 9 years ago
- DataHack Challenges - Challenges offered during our hackathon by top data companies.☆12Jan 28, 2020Updated 6 years ago
- This repository focuses on saving my linkedin articles and stuff that I find "USEFUL" on LinkedIn.☆156Jan 18, 2023Updated 3 years ago
- Big Data's open seminars: An Interactive Introduction to Reinforcement Learning☆14Nov 21, 2017Updated 8 years ago
- ☆100Jun 25, 2018Updated 8 years ago
- Our solution to the data science hackathon by McKinsey, Prohack by our team D1D, which was ranked 4th on public leaderboard and 25th on p…☆10Jun 21, 2020Updated 6 years ago
- Refactors the source code provided with the paper A Biterm Topic Model for Short Texts, compiles it into a Python extension module, and wraps it in Python to provide a user-friendly Python package☆11Apr 15, 2019Updated 7 years ago
- Huemul BigDataGovernance is a framework that works on top of Spark, Hive, and HDFS. It enables the implementation of a corporate strategy …☆11Apr 21, 2023Updated 3 years ago
- Fast code to learn deep neural networks in R.☆37Apr 1, 2016Updated 10 years ago
- A curated list of practical resources for Agentic Engineering☆84Updated this week
- Retail and Logistics Logs for 1.2.0☆13Jul 22, 2025Updated 11 months ago
- List of interesting links about ML Algorithms, Data Science, Network Analysis, and others.☆13May 9, 2023Updated 3 years ago
- Datasets and notebooks☆13Oct 26, 2016Updated 9 years ago
- Reddit Data Science Project Ideas☆11Dec 28, 2019Updated 6 years ago
- Solutions to Crack Coding Interview in Python☆14Mar 8, 2015Updated 11 years ago
- PyTorch implementation of "Weight Uncertainties in Neural Networks" (Bayes-by-Backprop)☆15Sep 10, 2018Updated 7 years ago
- DISC is a behavior assessment tool based on the DISC theory of psychologist William Moulton Marston☆16Aug 15, 2018Updated 7 years ago
- Machine Learning Notebooks with Turicreate and Keras in a Docker Container☆19Apr 26, 2019Updated 7 years ago
- Interactive Elasticsearch Analyzer☆13Dec 8, 2022Updated 3 years ago
- Fast Python Collaborative Filtering for Implicit Datasets☆15Oct 17, 2016Updated 9 years ago
- A comprehensive guide to applying statistical techniques in machine learning, including data preprocessing, model development, evaluation…☆28Jan 29, 2025Updated last year
- Tools for performing hyperparameter search with Scikit-Learn and Dask http://dask-searchcv.readthedocs.io☆11Nov 16, 2017Updated 8 years ago
- aigc evals☆10Dec 2, 2023Updated 2 years ago
- ☆18Apr 25, 2023Updated 3 years ago
- A tutorial for using Hadoop with Python and Hive☆10May 26, 2015Updated 11 years ago
- Galvanize DSI Capstone: Subreddit Recommender☆15Jan 15, 2019Updated 7 years ago
- Kaadugal is a parallelized multi-core C++ implementation of the random forests algorithm for classification, regression, and structured p…☆23Oct 19, 2017Updated 8 years ago
- A repository with different graph processing technologies☆11Nov 30, 2015Updated 10 years ago
- Tensorflow implementation of Bayes-by-Backprop algorithm from "Weight uncertainty in neural networks" paper☆14Mar 6, 2019Updated 7 years ago