DataHackIL / DataSetsLinks
A curated list of cool open datasets and APIs to use in machine learning driven projects.
☆27 Updated 7 years ago
Alternatives and similar repositories for DataSets
Users who are interested in DataSets are comparing it to the libraries listed below
- Very basic introduction to pyspark ☆15 Updated 8 years ago
- DataHack Challenges - Challenges offered during our hackathon by top data companies. ☆12 Updated 5 years ago
- Implementation of a text clustering algorithm using Kmeans clustering in order to derive quick insights from unstructured text ☆126 Updated last year
- Document clustering in Python ☆30 Updated 9 years ago
- sciblox - Easier Data Science and Machine Learning ☆50 Updated 8 years ago
- Create a graph from a community on Twitter with Tweepy, NetworkX, and Plotly. ☆28 Updated 7 years ago
- Notes for Data Science 350 Class ☆24 Updated 8 years ago
- All the notebooks for the analysis of Emotional Arcs within the Project Gutenberg corpus, see "The emotional arcs of stories are dominate… ☆32 Updated last year
- ☆35 Updated 2 years ago
- A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python. ☆83 Updated last year
- Content associated with a PyData Seattle 2017 tutorial on Unevenly spaced time series analysis of The Simpsons using pandas ☆15 Updated 8 years ago
- Python script to generate fake datasets optimized for testing machine learning/deep learning workflows ☆50 Updated 6 years ago
- Train word embeddings with Gensim and visualize them with TensorBoard ☆34 Updated 6 years ago
- Introduction to web scraping and text mining ☆48 Updated 5 years ago
- 💥 Browser-based slides or PDFs of our talks and presentations ☆94 Updated 6 years ago
- Sample of Python codes from mathematical problems ☆110 Updated 6 years ago
- Slides and materials for most of my talks by year ☆92 Updated 2 years ago
- A traits predictor using Python ☆15 Updated 7 years ago
- ☆103 Updated 2 years ago
- Text Preprocessing in Python ☆19 Updated 8 years ago
- Crowd Course Data Science course project ☆27 Updated 9 years ago
- NLP pipeline using word2vec (preprocessing/embedding/prediction/clustering) ☆116 Updated last year
- Extract synonyms, keywords from sentences using modified implementation of Aho Corasick algorithm ☆40 Updated 8 years ago
- Relatively simple text classification powered by spaCy ☆41 Updated 10 years ago
- In this Facebook live code along session with Hugo Bowne-Anderson, you're going to check out Google trends data of keywords 'diet', 'gym'… ☆44 Updated 7 years ago
- A research into the workflow for Kaggle competition (and data science in general) collaboratively ☆48 Updated 8 years ago
- ☆32 Updated 8 years ago
- Implementation of several black-box optimisation methods to tune hyperparameters of machine learning models. ☆78 Updated 7 years ago
- OptimalFlow is an omni-ensemble and scalable automated machine learning Python toolkit, which uses Pipeline Cluster Traversal Experiments… ☆27 Updated last year
- Clinical NLP Analysis with Elasticsearch and Kibana ☆35 Updated 6 years ago