vatsan / text_analytics_on_mpp
Collection of tutorials on text analytics/NLP, including vector space models, neural language models and topic models on the Pivotal MPP platform (Greenplum/HAWQ).
☆17 Updated 8 years ago
Alternatives and similar repositories for text_analytics_on_mpp:
Users who are interested in text_analytics_on_mpp are comparing it to the libraries listed below.
- A place for all things Pivotal & R ☆25 Updated 2 years ago
- A PL/Java Wrapper on Ark-Tweet-NLP (http://www.ark.cs.cmu.edu/TweetNLP/) - Twitter Parts-of-speech tagger in Postgres/Greenplum ☆17 Updated 10 years ago
- In-database parallel grid-search for XGBoost on Greenplum ☆15 Updated 6 years ago
- Using Word2Vec on lists and sets ☆34 Updated 9 years ago
- Data science repo to help others ☆12 Updated 9 years ago
- Repo for experiments on pyspark and sklearn ☆79 Updated 11 years ago
- A book on the applications of topic models. ☆14 Updated 7 years ago
- SmallK: very fast data clustering tools ☆14 Updated 5 years ago
- Some IPython notebooks I've created... ☆29 Updated 8 years ago
- Library for Geo-Inferencing in Twitter Data ☆28 Updated 8 years ago
- Python functions for popular relevance metrics (ndcg, err, etc) ☆15 Updated last year
- Dato/Turi DS Conf talk on NLP and Elasticsearch analysis of reviews, plus JS implementation ☆42 Updated 8 years ago
- This project contains the code to translate between Apache Spark and SFrame. ☆21 Updated 8 years ago
- An Apache Lucene TokenFilter that uses word2vec vectors for term expansion. ☆24 Updated 10 years ago
- System for mining Wikipedia Usage data to read our collective mind ☆21 Updated 10 years ago
- Deploy sentiment analysis using Flask ☆17 Updated 5 years ago
- spy on your random forests ☆19 Updated 4 years ago
- code and slides for my PyGotham 2016 talk, "Higher-level Natural Language Processing with textacy" ☆15 Updated 8 years ago
- Healthcare Twitter Analysis ☆26 Updated 8 years ago
- Additional files for the Otto Group Challenge hosted by Kaggle ☆36 Updated 9 years ago
- Simple FieldCache based query introspection Solr Search Component - solves the 'red sofa' problem ☆12 Updated 3 weeks ago
- scikit-learn addon to operate on set/"group"-based features ☆41 Updated 8 years ago
- A collection of examples illustrating data processing, data science, and machine learning on the Pivotal Greenplum and HAWQ MPP databases ☆20 Updated 8 years ago
- Tuffy, a Markov Logic Network solver ☆24 Updated 10 years ago
- Word2Vec models with Twitter data using Spark. Blog: ☆65 Updated 6 years ago
- ☆39 Updated 8 years ago
- Semanticizest: dump parser and client ☆20 Updated 8 years ago
- Tutorial for Deploying Anaconda Cluster and PySpark on top of Red Hat Storage GlusterFS ☆8 Updated 10 years ago
- Visualization of text sentiment using deep learning ☆43 Updated 8 years ago
- Scripts to Analyze Pronto's Data Release ☆24 Updated 9 years ago