Feature selection methods as Spark MLlib Pipelines
☆31Apr 29, 2018Updated 7 years ago
Alternatives and similar repositories for spark-FeatureSelection
Users who are interested in spark-FeatureSelection are comparing it to the libraries listed below.
- This package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is base…☆135May 5, 2022Updated 3 years ago
- Machine learning enhancements to Spark MLlib☆20Mar 19, 2015Updated 10 years ago
- a benchmark to test scalability of xgboost4j-spark and relevant projects☆22Dec 20, 2019Updated 6 years ago
- My answers to the exercises from the book "Scala for the impatient" (2nd edition) -- 2017.☆19May 24, 2017Updated 8 years ago
- API for converting JVM objects to representations by MIME type, for the Jupyter ecosystem.☆25Jan 16, 2020Updated 6 years ago
- Write WeApp with Scala.js☆19Dec 31, 2018Updated 7 years ago
- Spark ML Lib serving library☆48May 29, 2018Updated 7 years ago
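- Mainly addresses three problems in CTR prediction engineering: feature selection, feature indexing (feature discretization), and single-feature AUC and logloss.☆20Mar 30, 2017Updated 8 years ago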
- Spark Extension : ML transformers, SQL aggregations, etc that are missing in Apache Spark☆146Jan 26, 2016Updated 10 years ago
- Some popular algorithms (DBSCAN, KNN, FM, etc.) on Spark☆32May 29, 2018Updated 7 years ago
- Scala framework for collecting performance metrics and conducting sound experimental benchmarking.☆13Nov 19, 2025Updated 3 months ago
- Visualize streaming machine learning in Spark☆177Jun 29, 2017Updated 8 years ago
- Open source formats for scalable genomic processing systems using Avro. Apache 2 licensed.☆41Feb 13, 2026Updated 2 weeks ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python)☆74Nov 9, 2023Updated 2 years ago
- Magic to help Spark pipelines upgrade☆34Sep 29, 2024Updated last year
- EncryCore node reference implementation☆15Apr 2, 2020Updated 5 years ago
- Sangria akka-streams integration☆11Feb 8, 2026Updated 3 weeks ago
- Crime correlation analysis☆10Aug 8, 2018Updated 7 years ago
- REDCap Electronic Data - I (Ingester/Integrator/Importer)☆10Oct 15, 2018Updated 7 years ago
- Scala for the Impatient (2nd edition) - My Solutions☆10Dec 22, 2017Updated 8 years ago
- Subset Met Office MOGREPS-UK and UKV on AWS EC2☆12Oct 22, 2021Updated 4 years ago
- Course Materials for the MMCi Practical Data Science Course☆19Apr 10, 2020Updated 5 years ago
- Source code for 'Pro Spark Streaming' by Zubair Nabi☆10Mar 27, 2017Updated 8 years ago
- ☆11May 21, 2021Updated 4 years ago
- Code for the "Sample-efficient Integration of New Modalities into Large Language Models" paper☆16Sep 8, 2025Updated 5 months ago
- Simple role for deploying Elixir Exrm releases.☆10Jan 28, 2016Updated 10 years ago
- An example of a multiple workspace deployment with reusable modules.☆13May 28, 2025Updated 9 months ago
- ☆11Dec 23, 2017Updated 8 years ago
- An Apache Mesos Framework that allows for replaying load over and over and over (and over) again☆10Aug 10, 2015Updated 10 years ago
- This is a POC to test pgTAP (I use Docker image) to write and execute PL/pgSQL - SQL Procedural Language.☆11Mar 1, 2018Updated 8 years ago
- A benchmark tool for lakehouses.☆14Mar 12, 2023Updated 2 years ago
- First-order knowledge compilation for lifted probabilistic inference☆11Jun 14, 2017Updated 8 years ago
- A Framework for building Distributed Consensus Protocols☆10Oct 13, 2017Updated 8 years ago
- Hello world project templates for getting started quickly with Nix☆12Oct 15, 2023Updated 2 years ago
- ADT support for Flink with Shapeless☆12Jan 11, 2020Updated 6 years ago
- Transitmap is an interactive realtime visualisation of all public transport in Sweden.☆11Jun 6, 2025Updated 8 months ago
- Ansible playbook for managing Galaxy infrastructure. For the playbook managing Galaxy itself, see https://github.com/galaxyproject/usegal…☆12Feb 23, 2026Updated last week
- A system for managing files and file replicas across many diverse sites☆11Mar 23, 2023Updated 2 years ago
- Some Avro operations in Scala☆10Feb 9, 2026Updated 3 weeks ago