DataHackIL / DataSetsLinks
A curated list of cool open datasets and APIs to use in machine learning driven projects.
☆27 · Updated 7 years ago
Alternatives and similar repositories for DataSets
Users that are interested in DataSets are comparing it to the libraries listed below
- Document clustering in Python ☆30 · Updated 9 years ago
- A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python. ☆83 · Updated last year
- Very basic introduction to pyspark ☆15 · Updated 8 years ago
- Python library implementing ensemble methods for regression, classification and visualisation tools including Voronoi tessellations. ☆126 · Updated 5 years ago
- Relatively simple text classification powered by spaCy ☆41 · Updated 10 years ago
- NLP pipeline using word2vec (preprocessing/embedding/prediction/clustering) ☆116 · Updated last year
- Python script to generate fake datasets optimized for testing machine learning/deep learning workflows ☆50 · Updated 6 years ago
- Repo for my talk at the PyData Berlin 2017 conference ☆65 · Updated 8 years ago
- Introduction to web scraping and text mining ☆48 · Updated 6 years ago
- Materials for the workshop Advanced Text Analysis with SpaCy and Scikit-Learn, given at NYU during NYCDH Week 2017, at PyData NYC in Nov.… ☆83 · Updated 3 years ago
- Implementation of a text clustering algorithm using Kmeans clustering in order to derive quick insights from unstructured text ☆126 · Updated last year
- Start your journey into social media analysis of politicians by using Python (Tutorial) ☆21 · Updated 6 years ago
- Slides and materials for most of my talks by year ☆93 · Updated 2 years ago
- Stanford Named Entity Recognizer (NER) - Python Wrapper ☆81 · Updated 5 years ago
- Awesome list of AI Fairness tools, research papers, tutorials and any other relevant materials. For use by data scientists, AI engineers … ☆16 · Updated 6 years ago
- Negation detection NLP tool. If you use the code, please cite George Gkotsis, Sumithra Velupillai, Anika Oellrich, Harry Dean,… ☆55 · Updated 8 years ago
- 💥 Browser-based slides or PDFs of our talks and presentations ☆94 · Updated 7 years ago
- Presentations & notebooks from our talks/workshops/meetups/etc. ☆24 · Updated 7 years ago
- A library for topic modeling and browsing ☆89 · Updated 7 years ago
- Clinical NLP Analysis with Elasticsearch and Kibana ☆35 · Updated 6 years ago
- Delta believes in building technical capacity all over the world. We believe that data is powerful, and that anybody should be able to ha… ☆25 · Updated 7 years ago
- Resources for the Data Mining for Business and Governance course. ☆57 · Updated 5 years ago
- 💫 Jupyter notebooks for spaCy examples and tutorials ☆288 · Updated 7 years ago
- allennlp tutorial for O'Reilly AI Conference, September 2019 ☆22 · Updated 6 years ago
- Notes for Data Science 350 Class ☆24 · Updated 8 years ago
- Pandas integration with sklearn ☆21 · Updated 9 years ago
- Training time estimation for scikit-learn algorithms ☆124 · Updated 4 years ago
- NLP tutorial for the Berlin Data Science Retreat ☆41 · Updated 9 years ago
- A very brief introduction to Natural Language Processing programming in Python ☆149 · Updated 2 years ago
- An introduction to using spaCy for NLP and machine learning ☆193 · Updated 3 years ago