gr33ndata / dmoz-urlclassifier
Preparing DMOZ dataset for my n-Gram LM-based URL classification research
☆32 · Updated 10 years ago
Alternatives and similar repositories for dmoz-urlclassifier
Users who are interested in dmoz-urlclassifier are comparing it to the repositories listed below.
- Algorithms for URL Classification ☆19 · Updated 10 years ago
- Code base for representation learning of very short texts, such as tweets. By Cedric De Boom, IBCN, Ghent University, Belgium. ☆36 · Updated 9 years ago
- ☆21 · Updated 8 years ago
- Active Learning for text classification using scikit-learn ☆24 · Updated 5 years ago
- Reduction is a Python script that automatically summarizes a text by extracting the sentences deemed most important. ☆55 · Updated 10 years ago
- Labeled examples from wiki dumps in Python ☆67 · Updated 8 years ago
- Query-Document Relevance ☆42 · Updated 10 years ago
- Some add-on modules for the NetworkX library ☆78 · Updated 4 years ago
- Semantic embeddings of entities ☆66 · Updated 8 years ago
- Paragraph Vector Implementation ☆56 · Updated 7 years ago
- [NO LONGER MAINTAINED AS OPEN SOURCE - USE SCALETEXT.COM INSTEAD] ☆108 · Updated 11 years ago
- A Python wrapper around the ZPar parser for English. ☆49 · Updated 4 years ago
- Multiclass Naive Bayes SVM (NB-SVM) - Learns a multiclass classifier (OneVsRest) based on word ngrams. ☆35 · Updated 9 years ago
- Text classification with Reuters-21578 datasets using Gensim Word2Vec and Keras LSTM ☆44 · Updated 7 years ago
- Dynamic Topic Model (based upon code released by David Blei at http://www.cs.princeton.edu/~blei/topicmodeling.html) ☆31 · Updated 7 years ago
- A scikit-learn style implementation of NBSVM ☆17 · Updated 9 years ago
- Experiments with text summarization techniques and exploring TensorFlow. ☆15 · Updated 8 years ago
- Collection of functions and scripts for text retrieval in Python: Document collection preprocessing, Feature Selection, Indexing, Query p… ☆44 · Updated 12 years ago
- Knowledge extraction from web data ☆92 · Updated 7 years ago
- Simple factoid question answering system ☆23 · Updated 9 years ago
- Using Centroids of Word Embeddings and Word Mover's Distance for Biomedical Document Retrieval in Question Answering. ☆14 · Updated 7 years ago
- Reimplementation of the DeepWalk algorithm from https://github.com/phanein/deepwalk ☆38 · Updated 9 years ago
- Creating a dataset for person name disambiguation using a combination of sources such as Wikipedia, DBLP authors, and PPDB. ☆52 · Updated 7 years ago
- SUMPY: a Python automatic text summarization library ☆55 · Updated 9 years ago
- Word vectors ☆64 · Updated 6 years ago
- Normalizes lexically ill-formed text to its most likely clean text, e.g. "c u thr 2nite!" -> "see you there tonight!". ☆63 · Updated 9 years ago
- POC IDS anomaly detection engine built with iPython notebook, matplotlib, pandas, numpy, scikit-learn, d3.js, hyperloglog implementation,… ☆79 · Updated 10 years ago
- Code for "Performance shootout between nearest-neighbour libraries": http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neig… ☆99 · Updated 9 years ago
- Extract opinion phrases from user reviews ☆63 · Updated 10 years ago
- A toolkit for generating paraphrase vector representations for words in context ☆23 · Updated 9 years ago