black-tea / data-projects
A compendium of data projects and associated blog posts
☆10 Updated 5 years ago
Alternatives and similar repositories for data-projects
Users who are interested in data-projects are comparing it to the repositories listed below.
- Classify a job description (or noisy job title) into an ONET job title ☆19 Updated 8 years ago
- ☆16 Updated 4 years ago
- Package that returns a company embedding given a company name ☆46 Updated 5 years ago
- Material for UW Extension Data Science 350 ☆19 Updated 7 years ago
- This repository contains machine learning related work for the corpus to graph project, including Jupyter research notebooks and a Flask … ☆46 Updated 8 years ago
- A code-based tutorial for production level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, & Apache Drill u… ☆26 Updated 6 years ago
- Tutorial code and data for the entity resolution workshops. ☆45 Updated 10 years ago
- Build intelligent data-driven applications with minimal effort. Sentence Clustering, Topics Extraction, Text Similarity, Opinion Summariz… ☆40 Updated 5 years ago
- Extract LinkedIn comments from any post and export them to an Excel file ☆23 Updated 6 years ago
- Automatically labeling training data ☆107 Updated 6 years ago
- Build a deep learning model for predicting the named entities from text. ☆56 Updated 6 years ago
- Propensity models make true predictions about a customer's future behavior. With propensity models you can truly anticipate a customer's … ☆17 Updated 6 years ago
- An example of how to train supervised classifiers for multi-label text classification using sklearn pipelines ☆110 Updated 7 years ago
- Tutorial for Topic Modelling using PySpark and Spark NLP ☆17 Updated 5 years ago
- Binding the GDELT universe in a Spark environment ☆25 Updated 2 years ago
- Materials for Convolutional Methods for Text workshop at PyCon2017 ☆11 Updated 8 years ago
- Code for the Lord of the Machines hackathon ☆10 Updated 7 years ago
- Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Sear… ☆86 Updated 4 years ago
- Topic modelling on financial news with Natural Language Processing ☆59 Updated 7 years ago
- Resources for the Data Mining for Business and Governance course. ☆54 Updated 4 years ago
- Clustering analysis of one million tweets using scikit-learn, including basic benchmarking of various clustering algorithms ☆36 Updated 8 years ago
- NLP pipeline using word2vec (preprocessing/embedding/prediction/clustering) ☆115 Updated last year
- Topic Modelling for Humans ☆22 Updated 7 years ago
- Models and Pipelines for the Spark NLP library ☆113 Updated 3 years ago
- Slides, code and more for my class: Data Analytics and Machine Learning on Big Data ☆8 Updated 7 years ago
- Using NLP to cluster Reddit user comments by topics ☆14 Updated 8 years ago
- Quora Kaggle Competition: Natural Language Processing using word2vec embeddings, scikit-learn and xgboost for training ☆18 Updated 6 years ago
- Production Machine Learning Pipeline for Text Classification with fastText ☆32 Updated 4 years ago
- Instant search for and access to many datasets in PySpark. ☆34 Updated 2 years ago
- Tutorial on deploying machine learning models to production ☆59 Updated 5 years ago