Feature engineering with sklearn
☆178 · Jul 19, 2018 · Updated 7 years ago
Alternatives and similar repositories for sklearn-feature-engineering
Users that are interested in sklearn-feature-engineering are comparing it to the libraries listed below.
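- Feature engineering and class-imbalance handling methods commonly used in competitions ☆17 · Aug 16, 2018 · Updated 7 years ago
- Feature engineering for machine learning, including feature extraction, feature preprocessing, feature selection, and dimensionality reduction. ☆25 · Feb 25, 2019 · Updated 7 years ago
- CCF2018: data mining, machine learning, intelligent matching, feature engineering ☆50 · Sep 27, 2019 · Updated 6 years ago
- Feature engineering for data, various machine learning regression models, and preprocessing of regression data ☆43 · Jan 14, 2020 · Updated 6 years ago
- A collection of all the methods used in feature engineering, gathered for reuse ☆11 · Jan 11, 2021 · Updated 5 years ago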
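- CCF BDCI 2022 competition, "predicting people returning to their hometowns for development" task: baseline, data mining (feature engineering + ensemble learning), team rank 39/2297 ☆12 · Mar 15, 2024 · Updated 2 years ago
- Employee attrition prediction training competition ☆10 · Aug 25, 2017 · Updated 8 years ago
- Reviews and attributes the causes of bond default events up to July 17, 2017, identifies macro liquidity factors, and assembles them into a dataset. After feature extraction with Lasso regression, the features are fed into machine learning models such as L2-penalized LR, SVM, NN, GBDT, and RF to predict defaults, finding that GBDT predicts best and that feature engineering matters for the predictive performance of linear models… ☆58 · Mar 7, 2019 · Updated 7 years ago
- Multi Channel Attribution ☆10 · Mar 7, 2017 · Updated 9 years ago
- [Translation] Feature Engineering for Machine Learning ☆2,552 · Aug 25, 2023 · Updated 2 years ago
- Combines sentiment analysis of listed companies' prospectuses with common financial indicators, corporate R&D indicators, and similar data, then uses several classification models (traditional LR, random forest, and the XGB and LGB ensemble learning models) to learn and predict whether newly listed shares fall below their issue price, selects the important features, and from these builds a classifier for new shares breaking their issue price. ☆14 · Aug 26, 2023 · Updated 2 years ago
- Some personal study notes ☆60 · Apr 6, 2021 · Updated 4 years ago
- Dimensionality reduction with PCA and LDA ☆38 · Apr 5, 2020 · Updated 5 years ago
- Implements the algorithms from Li Hang's 《统计学习方法》 (Statistical Learning Methods) in two ways, with Python and with sklearn ☆340 · Jul 3, 2018 · Updated 7 years ago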
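- A feature extraction and scoring project based on self-constructed functions (missing value handling, univariate correlation analysis, feature scoring, dimensionality reduction) ☆15 · Jul 21, 2017 · Updated 8 years ago
- TalkingData AdTracking Fraud Detection Challenge on Kaggle Competition ☆13 · Sep 24, 2018 · Updated 7 years ago
- 66 classic and common interview problems from 《剑指offer》 with multiple-method-CPP solutions, and common data structure summary, etc ☆20 · Mar 10, 2021 · Updated 5 years ago
- Common data preprocessing, including data loading, handling of missing values and outliers, converting descriptive variables to numeric ones, train/test splitting, and data normalization ☆48 · Sep 19, 2023 · Updated 2 years ago
- Weight of Evidence: finds the optimal binning based on the idea of maximizing the IV value ☆15 · Oct 24, 2019 · Updated 6 years ago
- A summary of methods for comprehensively evaluating a scorecard model (with code) ☆74 · Feb 17, 2019 · Updated 7 years ago
- Tianchi big data competition, 千里马大赛, risk identification and prediction task, Top 5 ☆14 · May 16, 2019 · Updated 6 years ago
- featselector is a feature selector based on statistical analysis and model selection. ☆14 · Mar 4, 2019 · Updated 7 years ago
- An automated feature engineering and training tool written for Tianchi data competitions; it can generate features from a MySQL database through configuration. It also re-wraps the data, features, and models so that an automated testing system can recognize and call them. Remaining work: the key scheduling technology for the automated testing system. ☆12 · Dec 6, 2015 · Updated 10 years ago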
- ☆10 · Dec 12, 2019 · Updated 6 years ago
- Derives variables with GBDT and applies the derived variables ☆21 · Mar 15, 2020 · Updated 6 years ago
- 2020 Tencent Advertising Algorithm Competition, Rank 19 ☆18 · Aug 1, 2020 · Updated 5 years ago
- A code repository for my Tianchi big data competition. ☆117 · Mar 12, 2018 · Updated 8 years ago
- This is a group project for E-commerce repeat buyers purchase prediction using machine learning while accounting for imbalance outcome fo… ☆12 · Dec 29, 2020 · Updated 5 years ago
- Commonly used feature selection methods ☆67 · Jul 4, 2022 · Updated 3 years ago
- Comparison of XGBoost and LightGBM (speed, accuracy and complexity) ☆21 · Dec 8, 2018 · Updated 7 years ago
- Code for KDD CUP 2019 Auto-ML track ☆21 · Jul 25, 2019 · Updated 6 years ago
- The ts302_team final solution to the KDD CUP 2019 AutoML Track problem. ☆15 · Jul 3, 2020 · Updated 5 years ago
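- A small tool for syncing Hive data warehouse data to Elasticsearch ☆20 · Feb 3, 2018 · Updated 8 years ago
- Tencent Advertisement Algorithm Competition 2020 / 2020腾讯广告算法大赛 ☆25 · Jun 29, 2020 · Updated 5 years ago
- Implements two LogisticRegression variants usable in an sklearn pipeline, based on statsmodels and scikit-learn respectively, and outputs the corresponding reports ☆21 · May 21, 2023 · Updated 2 years ago
- Alpha mining with DEAP-based genetic programming. ☆11 · Jul 7, 2023 · Updated 2 years ago
- Several commonly used feature selection methods in sklearn ☆41 · Jan 21, 2016 · Updated 10 years ago
- ☆42 · Jul 28, 2021 · Updated 4 years ago
- Integrates existing open-source distributed machine learning tools (mainly parameter-server-based logistic regression, xgboost, FFM, and FM) to build an industrial-grade click-through-rate prediction pipeline that can be used online ☆26 · Jun 6, 2017 · Updated 8 years ago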