A collection of fundamental statistics implemented on Apache Spark
☆31 · Feb 11, 2016 · Updated 10 years ago
Alternatives and similar repositories for StatisticsOnSpark
Users who are interested in StatisticsOnSpark compare it to the repositories listed below
- Topic Modeling on Apache Spark ☆94 · Mar 1, 2019 · Updated 7 years ago
- Spark MLlib code optimized to efficiently support sparse data ☆51 · Dec 22, 2016 · Updated 9 years ago
- Yelp Restaurant Photo Classification - Kaggle competition ☆12 · Apr 19, 2019 · Updated 6 years ago
- Winter Break Collaboratory DS Boot Camp during the academic year of 2017-2018 ☆14 · Feb 12, 2018 · Updated 8 years ago
- Gradient Boosting Enhanced with Step-Wise Feature Augmentation ☆17 · Jan 13, 2021 · Updated 5 years ago
- ☆20 · Dec 1, 2016 · Updated 9 years ago
- Modeling Tanimoto distributions for RDKit ☆18 · Feb 28, 2020 · Updated 6 years ago
- AugBoost: Gradient Boosting Enhanced with Step-Wise Feature Augmentation (2019 IJCAI paper) ☆23 · Oct 22, 2019 · Updated 6 years ago
- hyb: a bioinformatics pipeline for the analysis of CLASH (crosslinking, ligation and sequencing of hybrids) data ☆13 · Jul 12, 2024 · Updated last year
- Machine learning evaluation database ☆24 · Feb 7, 2018 · Updated 8 years ago
- ☆62 · Jul 11, 2019 · Updated 6 years ago
- Parallel ML System - Bosen Java implementation ☆28 · Jan 23, 2017 · Updated 9 years ago
- Program and links to the material for the GloBIAS Training School 2025, Kobe, Japan. ☆22 · Oct 27, 2025 · Updated 4 months ago
- ☆11 · Oct 7, 2025 · Updated 4 months ago
- Bayesian Logistic Regression with Hyper-LASSO priors ☆10 · Dec 14, 2025 · Updated 2 months ago
- Collected scripts for Pymol ☆10 · Mar 18, 2015 · Updated 10 years ago
- ☆10 · Nov 15, 2015 · Updated 10 years ago
- WebGL based molecular viewer ☆36 · Feb 13, 2026 · Updated 2 weeks ago
- Google's Java documentation generation tool. Static page generator which uses templates and has the possibility for versioning. ☆11 · Apr 24, 2018 · Updated 7 years ago
- ☆14 · Jun 5, 2020 · Updated 5 years ago
- Data cleaning and exploration in Pandas via Jupyter notebook ☆10 · Jun 17, 2019 · Updated 6 years ago
- Gromacs molecular dynamics simulation analysis scripts ☆10 · Apr 5, 2022 · Updated 3 years ago
- ☆11 · Nov 30, 2024 · Updated last year
- RFM (recency, frequency, monetary) analysis ☆13 · Aug 11, 2018 · Updated 7 years ago
- MotoGP/Linear Regression/Web Scraping ☆10 · Mar 12, 2018 · Updated 7 years ago
- A simple example for PySpark based project. ☆11 · Jun 3, 2016 · Updated 9 years ago
- Shared repo supporting the App Center client apps. ☆13 · Nov 17, 2017 · Updated 8 years ago
- Linear MALDI-ToF simultaneous spectrum deconvolution and baseline removal ☆12 · Jan 23, 2020 · Updated 6 years ago
- 16S rRNA Sequencing Data from the Human Microbiome Project ☆10 · Oct 30, 2025 · Updated 4 months ago
- community detection in multiplex networks ☆10 · Apr 23, 2016 · Updated 9 years ago
- Computer Science, Data Science and ML Fundamentals ☆11 · May 30, 2025 · Updated 9 months ago
- Machine Learning based model to predict Insurance Pure Premium ☆12 · Jan 24, 2017 · Updated 9 years ago
- PaNeV: an R package for a pathway-based network visualization ☆10 · Aug 25, 2025 · Updated 6 months ago
- python simulation interface for molecular modeling. To cite this software publication: https://www.sciencedirect.com/science/article/pii/… ☆13 · Aug 24, 2016 · Updated 9 years ago
- pymol implementation of WaterDock with Akshay Sridhar (@akshay-sridhar) and refactoring work by Patrick McCubbin (@mccubbinp) ☆11 · Mar 25, 2024 · Updated last year
- A collection of code (mostly Jupyter notebooks) associated with entries on my blog ☆12 · Oct 8, 2017 · Updated 8 years ago
- ☆15 · Jun 29, 2015 · Updated 10 years ago
- A Ruby wrapper for the World Bank Development Indicators API ☆19 · Aug 9, 2012 · Updated 13 years ago
- Used by git2r, openssl ☆10 · Jul 14, 2023 · Updated 2 years ago