A collection of fundamental statistics implemented on Apache Spark
☆31 · Feb 11, 2016 · Updated 10 years ago
Alternatives and similar repositories for StatisticsOnSpark
Users who are interested in StatisticsOnSpark are comparing it to the libraries listed below.
- Spark MLlib code optimized to efficiently support sparse data ☆51 · Dec 22, 2016 · Updated 9 years ago
- Topic Modeling on Apache Spark ☆94 · Mar 1, 2019 · Updated 7 years ago
- Yelp Restaurant Photo Classification - Kaggle competition ☆12 · Apr 19, 2019 · Updated 6 years ago
- Winter Break Collaboratory DS Boot Camp during the academic year of 2017-2018 ☆14 · Feb 12, 2018 · Updated 8 years ago
- Parallel ML System - Bosen Java implementation ☆27 · Jan 23, 2017 · Updated 9 years ago
- ☆22 · Dec 9, 2015 · Updated 10 years ago
- Step-by-step Deep Learning Tutorials on Apache Spark using BigDL ☆210 · Jan 3, 2023 · Updated 3 years ago
- Gradient Boosting Enhanced with Step-Wise Feature Augmentation ☆17 · Jan 13, 2021 · Updated 5 years ago
- How to build your first Spark application with MLlib, StructuredStreaming, GraphFrames, Datasets and so on? Answer is here! ☆53 · Nov 5, 2019 · Updated 6 years ago
- Machine learning evaluation database ☆24 · Feb 7, 2018 · Updated 8 years ago
- Another, hopefully better, implementation of ALS on Spark ☆14 · May 20, 2015 · Updated 10 years ago
- TensorFlow implementation of a Neural Attention Model for Abstractive Summarization. ☆10 · Jul 20, 2020 · Updated 5 years ago
- ☆20 · Dec 1, 2016 · Updated 9 years ago
- Cascading and Scalding wrapper for HBase with advanced read features ☆54 · Feb 11, 2020 · Updated 6 years ago
- Pattern-of-Behavior Search Tool ☆11 · Jun 20, 2022 · Updated 3 years ago
- Spatial error estimation and variable importance ☆20 · Jan 30, 2025 · Updated last year
- Word2Vec - Google's word2vec in Scala using UMASS factorie library for better hacking and research. ☆16 · Apr 7, 2014 · Updated 11 years ago
- ☆62 · Jul 11, 2019 · Updated 6 years ago
- Yahoo!'s topic modelling framework using Latent Dirichlet Allocation ☆98 · Sep 21, 2011 · Updated 14 years ago
- JVM related exercises ☆11 · Jul 16, 2017 · Updated 8 years ago
- Timeseries segmentation library ☆12 · Mar 8, 2023 · Updated 3 years ago
- Spark Implementation of BIRCH Clustering algorithm ☆13 · Feb 18, 2020 · Updated 6 years ago
- ☆20 · Nov 16, 2014 · Updated 11 years ago
- The STINGER in-memory graph store and dynamic graph analysis platform. Millions to billions of vertices and edges at thousands to millio… ☆12 · Nov 10, 2015 · Updated 10 years ago
- ExtJS component for drawing trees (actually Directed Acyclic Graphs) ☆21 · Aug 16, 2012 · Updated 13 years ago
- Memory consumption estimator for Scala/Java ☆26 · Nov 24, 2014 · Updated 11 years ago
- MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion. ☆14 · Apr 12, 2022 · Updated 3 years ago
- Redis Command Quick Reference ☆14 · Nov 6, 2017 · Updated 8 years ago
- Ansible playbooks to help deploy Apache Hadoop, Spark, Storm, ZooKeeper, Elasticsearch, Azkaban, Flume, HBase, Kafka, Kibana, Logstash ☆10 · Mar 21, 2017 · Updated 9 years ago
- Manual for RStudio Server ☆16 · Oct 5, 2013 · Updated 12 years ago
- Cassandra river for Elasticsearch. ☆37 · Jul 15, 2013 · Updated 12 years ago
- Python scripts to facilitate easy working ☆11 · Jun 24, 2024 · Updated last year
- An introduction to Hockey Visualization with D3.js ☆15 · Mar 27, 2018 · Updated 7 years ago
- ☆24 · Jul 2, 2015 · Updated 10 years ago
- This project is for the notebooks, code, and data for the "Vocabulary Analysis of Job Descriptions" tutorial at PyData 2017 Seattle ☆20 · Jul 12, 2017 · Updated 8 years ago
- A Multi Layer Perceptron (MLP) Artificial Neural Network (ANN) Framework Developed in C for Machine Learning (ML) and Deep Learning (DL) ☆11 · May 4, 2025 · Updated 10 months ago
- Code for Springer Book: High Performance Distributed Computing: Case Studies with Hadoop, Scalding and Spark ☆15 · Oct 6, 2017 · Updated 8 years ago
- Fetch and Convert NHL Play by Play game data ☆13 · Oct 9, 2017 · Updated 8 years ago
- Using LANDSAT7 satellite images to predict population. ☆12 · Jun 10, 2016 · Updated 9 years ago