Assembly of fundamental statistics implemented based on Apache Spark
☆31Feb 11, 2016Updated 10 years ago
Alternatives and similar repositories for StatisticsOnSpark
Users that are interested in StatisticsOnSpark are comparing it to the libraries listed below.
- Spark MLlib code optimized to efficiently support sparse data☆51Dec 22, 2016Updated 9 years ago
- Topic Modeling on Apache Spark☆94Mar 1, 2019Updated 7 years ago
- Yelp Restaurant Photo Classification - Kaggle competition☆12Apr 19, 2019Updated 7 years ago
- Winter Break Collaboratory DS Boot Camp during the academic year of 2017-2018☆14Feb 12, 2018Updated 8 years ago
- Step-by-step Deep Learning Tutorials on Apache Spark using BigDL☆210Jan 3, 2023Updated 3 years ago
- Gradient Boosting Enhanced with Step-Wise Feature Augmentation☆17Jan 13, 2021Updated 5 years ago
- Implementation of End-To-End Memory Networks with Tensorflow for bAbI Dataset☆11Aug 17, 2017Updated 8 years ago
- ☆14Aug 24, 2021Updated 4 years ago
- Machine learning evaluation database☆24Feb 7, 2018Updated 8 years ago
- Another, hopefully better, implementation of ALS on Spark☆14May 20, 2015Updated 11 years ago
- Tensorflow implementation of a Neural Attention Model for Abstractive Summarization.☆10Jul 20, 2020Updated 5 years ago
- Cascading and Scalding wrapper for HBase with advanced read features☆54Feb 11, 2020Updated 6 years ago
- Bachelor thesis☆10Sep 11, 2015Updated 10 years ago
- Pattern-of-Behavior Search Tool☆11Jun 20, 2022Updated 3 years ago
- Spatial error estimation and variable importance☆20Jan 30, 2025Updated last year
- Dropwizard Metrics reporter for Apache Spark☆28Dec 22, 2014Updated 11 years ago
- ☆62Jul 11, 2019Updated 6 years ago
- Yahoo!'s topic modelling framework using Latent Dirichlet Allocation☆98Sep 21, 2011Updated 14 years ago
- JVM related exercises☆11Jul 16, 2017Updated 8 years ago
- Timeseries segmentation library☆12Mar 8, 2023Updated 3 years ago
- Spark Implementation of BIRCH Clustering algorithm☆13Feb 18, 2020Updated 6 years ago
- ☆20Nov 16, 2014Updated 11 years ago
- Memory consumption estimator for Scala/Java☆27Nov 24, 2014Updated 11 years ago
- Lecture materials☆18Jan 2, 2013Updated 13 years ago
- MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.☆14Apr 12, 2022Updated 4 years ago
- A focused web crawler based on Playwright, RMQ, Kafka and Flink.☆14Feb 4, 2021Updated 5 years ago
- Redis Command Cheat Sheet☆14Nov 6, 2017Updated 8 years ago
- AugBoost: Gradient Boosting Enhanced with Step-Wise Feature Augmentation (2019 IJCAI paper)☆23Oct 22, 2019Updated 6 years ago
- Manual for RStudio Server☆16Oct 5, 2013Updated 12 years ago
- Simple UDF to split JSON arrays into Hive arrays☆10Jun 24, 2016Updated 9 years ago
- Cassandra river for Elasticsearch.☆37Jul 15, 2013Updated 12 years ago
- Python scripts to facilitate easy working☆11Mar 23, 2026Updated last month
- An introduction to Hockey Visualization with D3.js☆15Mar 27, 2018Updated 8 years ago
- This project is for the notebooks, code, and data for the "Vocabulary Analysis of Job Descriptions" tutorial at PyData 2017 Seattle☆20Jul 12, 2017Updated 8 years ago
- ☆24Jul 2, 2015Updated 10 years ago
- Group project for the WorldQuant University module, risk management.☆13Feb 3, 2019Updated 7 years ago
- Code for Springer Book: High Performance Distributed Computing: Case Studies with Hadoop, Scalding and Spark☆15Oct 6, 2017Updated 8 years ago
- Using LANDSAT7 satellite images to predict population.☆12Jun 10, 2016Updated 9 years ago
- Modeling Tanimoto distributions for RDKit☆18Feb 28, 2020Updated 6 years ago