sramirez/spark-MDLP-discretization

Spark implementation of Fayyad's discretizer based on Minimum Description Length Principle (MDLP)

☆43

Alternatives and similar repositories for spark-MDLP-discretization

Users who are interested in spark-MDLP-discretization are comparing it to the libraries listed below.

sramirez / spark-infotheoretic-feature-selection
This package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is base…
☆134 · May 5, 2022 · Updated 4 years ago
LIDIAgroup / SparkFeatureSelection
Generic implementation of Information Theory-based Feature Selection methods. It also contains an Entropy Minimization Discretization imp…
☆19 · Jul 21, 2014 · Updated 12 years ago
modal-inria / MixtComp
Model-based clustering package for mixed data
☆13 · May 21, 2026 · Updated 2 months ago
wxhC3SC6OPm8M1HXboMy / spark-mrmr-feature-selection
Machine learning enhancements to Spark MLlib
☆20 · Mar 19, 2015 · Updated 11 years ago
kennethschen / Classification-Confidence-Intervals
PyPI package to calculate comprehensive confidence intervals for classification positive rate, precision, NPV, and recall using a labeled…
☆10 · Jul 6, 2023 · Updated 3 years ago
liuhongjiang / blog_code
Code for my own blog
☆10 · Nov 7, 2013 · Updated 12 years ago
Google-Health-API / sdk-i18n
Fitbit SDK example application.
☆10 · Mar 7, 2023 · Updated 3 years ago
Lewuathe / dllib
dllib is a distributed deep learning library running on Apache Spark
☆32 · Oct 26, 2017 · Updated 8 years ago
manuparra / MasterDatCom_BDCC_Practice
Practice and workshop on Big Data and Cloud Computing using Docker containers and OpenNebula. HDFS, Hadoop, and Spark+R
☆11 · Mar 16, 2017 · Updated 9 years ago
hortonworks-gallery / hdp22-twitter-demo
Monitor Twitter stream for S&P 500 companies to identify & act on unexpected increases in tweet volume
☆40 · Mar 6, 2016 · Updated 10 years ago
MLnick / glint-fm
Factorization Machines on Spark and Glint
☆25 · Nov 7, 2016 · Updated 9 years ago
josemarialuna / ClusterIndices
This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bo…
☆10 · Oct 29, 2018 · Updated 7 years ago
rjagerman / glint
Glint: High performance Scala parameter server
☆170 · Jul 20, 2018 · Updated 8 years ago
AnatoliiPotapov / squad
☆10 · Aug 17, 2017 · Updated 8 years ago
sramirez / fast-mRMR
An improved implementation of the classical feature selection method: minimum Redundancy and Maximum Relevance (mRMR).
☆83 · Apr 1, 2022 · Updated 4 years ago
breezedeus / breezedeus.github.io
Breezedeus's Blog
☆17 · Jul 4, 2023 · Updated 3 years ago
PacktPublishing / Mastering-Apache-Spark-2x
Mastering Apache Spark 2x, published by Packt
☆17 · Jan 30, 2023 · Updated 3 years ago
Open-Network-Insight / oni-ml
The machine learning component of Open Network Insight: scalable analytics combining Spark for big data and C / MPI for high performance …
☆13 · Nov 9, 2016 · Updated 9 years ago
uzh / fox
A framework for PSL inference.
☆22 · Nov 9, 2015 · Updated 10 years ago
Leemoonsoo / zeppelin-examples
Zeppelin notebook examples
☆25 · Feb 18, 2016 · Updated 10 years ago
viirya / SparkAffinityPropagation
Affinity Propagation on Spark
☆20 · May 31, 2021 · Updated 5 years ago
selvinsource / spark-pmml-exporter-validator
Using JPMML Evaluator to validate the PMML models exported from Spark
☆19 · May 1, 2017 · Updated 9 years ago
yanboliang / spark-vlbfgs
Vector-free L-BFGS implementation for Spark MLlib
☆46 · Jun 23, 2017 · Updated 9 years ago
glennmurray / sparkstr
Spark Streaming jobs.
☆11 · Mar 10, 2015 · Updated 11 years ago
claesenm / spark-ml-inventory
A curated inventory of machine learning methods available on the Apache Spark platform, both in official and third-party libraries.
☆64 · Apr 16, 2017 · Updated 9 years ago
FlinkML / flink-parameter-server
Parameter Server implementation in Apache Flink
☆56 · Oct 15, 2018 · Updated 7 years ago
Samsung / veles.sound_feature_extraction
Distributed machine learning platform
☆28 · Aug 20, 2015 · Updated 10 years ago
mengxr / spark-als
Another, hopefully better, implementation of ALS on Spark
☆14 · May 20, 2015 · Updated 11 years ago
aliyun / kafka-connect-oss
Kafka Connect suite of connectors for OSS
☆30 · Jul 1, 2022 · Updated 4 years ago
CODAIT / aardpfark
A library for exporting Spark ML models and pipelines to PFA
☆55 · Nov 21, 2018 · Updated 7 years ago
manuparra / volleyball-performance-analysis
R package for volleyball performance analysis and visualization
☆11 · Apr 22, 2017 · Updated 9 years ago
Data-Science-Big-Data-Research-Lab / MetaGen
MetaGen: A framework for metaheuristic development and hyperparameter optimization in machine and deep learning
☆32 · Mar 21, 2025 · Updated last year
MarcKaminski / spark-FeatureSelection
Feature selection methods as Spark MLlib Pipelines
☆30 · Apr 29, 2018 · Updated 8 years ago
sea-shunned / hawks
A package for generating synthetic clusters with control over "difficulty"
☆24 · Apr 24, 2026 · Updated 3 months ago
NestorRV / SOUL
SOUL: Scala Oversampling and Undersampling Library.
☆13 · Apr 11, 2019 · Updated 7 years ago
deepspark / deepspark
Deep learning framework running on Spark
☆63 · Dec 16, 2023 · Updated 2 years ago
implydata / druid-datagenerator
A data generator for Apache Druid
☆12 · Mar 26, 2025 · Updated last year
lisette-espin / pychimerge
ChiMerge: Discretization of Numeric Attributes
☆41 · Mar 15, 2016 · Updated 10 years ago
SUTDNLP / NNNamedEntity
Named Entity Recognition (NER) models (neural and sparse) implemented based on package LibN3L
☆20 · Jan 2, 2017 · Updated 9 years ago