black-tea / data-projects
A compendium of data projects and associated blog posts
☆10 · Updated 5 years ago
Alternatives and similar repositories for data-projects:
Users who are interested in data-projects are comparing it to the repositories listed below:
- Spark NLP for Streamlit ☆15 · Updated 3 years ago
- MLinProduction SageMaker workshop hosted in April 2020 ☆15 · Updated 4 years ago
- Automatic Text Summarization with Machine Learning ☆16 · Updated 7 years ago
- PySpark phonetic and string matching algorithms ☆39 · Updated last year
- Clustering analysis of one million tweets using scikit-learn, including basic benchmarking of various clustering algorithms ☆36 · Updated 8 years ago
- Quora Kaggle Competition: Natural Language Processing using word2vec embeddings, scikit-learn and xgboost for training ☆18 · Updated 6 years ago
- Package that returns a company embedding given a company name ☆44 · Updated 4 years ago
- Classify a job description (or noisy job title) into an ONET job title ☆18 · Updated 8 years ago
- Watson OpenScale tutorials including sample models, notebooks and applications ☆22 · Updated 2 years ago
- store my personal project ☆22 · Updated 4 years ago
- Webscikit is a set of tools to run a webserver as a JSON Webservice for scikit-learn predictions. It comes with two examples: boston and … ☆9 · Updated 7 years ago
- A Scalable Data Cleaning Library for PySpark. ☆26 · Updated 5 years ago
- Build intelligent data-driven applications with minimal effort. Sentence Clustering, Topics Extraction, Text Similarity, Opinion Summariz… ☆40 · Updated 5 years ago
- Slides, code and more for my class: Data Analytics and Machine Learning on Big Data ☆8 · Updated 7 years ago
- Large-scale Graph Mining with Spark ☆40 · Updated 6 years ago
- Augment IBM Watson Natural Language Understanding APIs with a configurable mechanism for text classification, using Watson Studio. ☆46 · Updated 5 years ago
- Notebooks configured to be run with Binder, usually found on my blog. ☆42 · Updated last year
- PySpark in Google Colab: A simple machine learning (Linear Regression) model ☆36 · Updated 5 years ago
- Follow the Lumiata Tech Blog on Medium! ☆21 · Updated last year
- In-Session Personalization Workshop for eCommerce, April 2021, and the MICES Workshop in June 2021. ☆22 · Updated 3 years ago
- Topic Modelling for Humans ☆22 · Updated 6 years ago
- This repo contains my hackathon solutions ☆38 · Updated 2 years ago
- Instant search for and access to many datasets in PySpark. ☆34 · Updated 2 years ago
- Predict the poverty of households in Costa Rica using automated feature engineering. ☆23 · Updated 4 years ago
- Text Similarity Search Application using Modern NLP and Elasticsearch ☆30 · Updated 4 years ago
- Text summarization algorithm for the Capstone Project at Springboard code bootcamp ☆54 · Updated 2 years ago
- Evolution of word vectors from long, sparse, and 1-hot to short, dense, and context sensitive ☆28 · Updated last year