DataHackIL / DataSets
A curated list of cool open datasets and APIs to use in machine learning driven projects.
☆27 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for DataSets
- Tool for sentiment analysis annotation ☆11 · Updated last month
- DataHack Challenges - Challenges offered during our hackathon by top data companies. ☆11 · Updated 4 years ago
- Spell correct entire sentences using nltk freqdist and symspell ☆19 · Updated 7 years ago
- Identifying author profile including age and gender from texts ☆17 · Updated 9 years ago
- A text similarity computation using minhashing and Jaccard distance on reuters dataset ☆16 · Updated 6 years ago
- Relatively simple text classification powered by spaCy ☆42 · Updated 9 years ago
- bamboolib - template for creating your own binder notebook ☆21 · Updated 2 years ago
- MetroMaps Release ☆16 · Updated 10 years ago
- ☆15 · Updated 6 years ago
- Random jupyter notebooks on data analysis and machine learning ☆14 · Updated 6 years ago
- Tutorials for web scraping and crawling ☆11 · Updated 4 years ago
- Orange Data Mining Homepage ☆16 · Updated 5 years ago
- Minimum Entropy is a DDL hosted question/answer site for beginners who need answers to Data Science questions. ☆16 · Updated 8 years ago
- A selection of business datasets ☆17 · Updated 5 years ago
- Regex like pattern tree matching but on sentence's tree instead of Strings ☆42 · Updated 6 years ago
- *SEM 2018: Learning Distributed Event Representations with a Multi-Task Approach ☆22 · Updated 6 years ago
- A Large Automatically-Constructed Resource of Predicate Paraphrases ☆43 · Updated 4 years ago
- Language Modelling, CMI vs Perplexity ☆11 · Updated 6 years ago
- Experimental library for sampling and validating scikit-learn parameters ☆10 · Updated 5 years ago
- A plugin for the GATE language technology framework for training and using machine learning models. Currently supports Mallet (MaxEnt, N… ☆26 · Updated last year
- Twitter user classification tutorial at PyCon France 2016 ☆21 · Updated last year
- Tools and services for evaluating topic models ☆15 · Updated 8 years ago
- All *.py scripts ☆15 · Updated 5 years ago
- Glossary of Machine Learning terms ☆16 · Updated 6 years ago
- Document clustering in Python ☆30 · Updated 8 years ago
- Brand disambiguator for tweets to differentiate e.g. Orange vs orange (brand vs foodstuff), using NLTK and scikit-learn ☆57 · Updated 11 years ago
- Scripts to take hand washing related text in (almost) any language and float it into a hand washing poster. ☆9 · Updated 3 years ago
- This is an Object Oriented implementation of a Trie in python. The class contains setter and getter methods, and implements several usefu… ☆14 · Updated 6 years ago