swarajban / multithreadedWordCounting
Word count of a large file using a prefix tree and parallel Python processes
☆18 Updated 11 years ago
Related projects
Alternatives and complementary repositories for multithreadedWordCounting
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi… ☆48 Updated 2 years ago
- Web page segmentation and noise removal ☆55 Updated 9 months ago
- LSTM-based query classification for Mandarin, implemented using TensorFlow ☆20 Updated 8 years ago
- Python code and data for the post "Word Segmentation, or Makingsenseofthis" ☆16 Updated 2 years ago
- A deep learning Chinese Word Segmentation toolkit ☆46 Updated 7 years ago
- A small test for recognizing persons with a word2vec model in German ☆13 Updated 9 years ago
- Experiment on text summarization techniques and exploring Tensorflow. ☆15 Updated 7 years ago
- Code for the CIKM 2013 paper "Discovering Coherent Topics Using General Knowledge" ☆11 Updated 10 years ago
- Named Entity Recognition demo with the NLTK ☆14 Updated 13 years ago
- Includes Code for Inference and Evaluation of Topic Models for Selectional Preferences ☆16 Updated last year
- A tool to segment text based on frequencies and the Viterbi algorithm "#TheBoyWhoLived" => ['#', 'The', 'Boy', 'Who', 'Lived'] ☆82 Updated 8 years ago
- Python bindings for libwapiti ☆66 Updated 4 years ago
- Code for the ACL-2015 paper "Accurate Linear-Time Chinese Word Segmentation via Embedding Matching" ☆38 Updated 8 years ago
- Query-Document Relevance ☆42 Updated 9 years ago
- A deep learning-based speller ☆27 Updated 5 years ago
- SUMPY: a python automatic text summarization library ☆55 Updated 8 years ago
- word2vec variations ☆8 Updated 6 years ago
- Tools and services for evaluating topic models ☆15 Updated 8 years ago
- Paragraph Vector Implementation ☆56 Updated 7 years ago
- Scripts for an upcoming blog "Extractive vs. Abstractive Summarization" for RaRe Technologies. ☆13 Updated 7 years ago
- An Apache Lucene TokenFilter that uses word2vec vectors for term expansion. ☆24 Updated 10 years ago
- Deep Character-Level Neural Machine Translation ☆72 Updated 7 years ago
- [NO LONGER MAINTAINED AS OPEN SOURCE - USE SCALETEXT.COM INSTEAD] ☆109 Updated 11 years ago
- ☆62 Updated 10 years ago
- A fork of bitbucket.org/tunystom/rankpy, adapted for Python3 and dmitru/pines ☆14 Updated 8 years ago
- TensorFlow implementation of Hierarchical Attention Networks for Document Classification and some extensions ☆95 Updated 7 years ago
- OpenTC is a text classification engine using several algorithms in machine learning ☆26 Updated 4 years ago
- Normalizes lexically ill-formed text to its most likely clean text, e.g. "c u thr 2nite!" -> "see you there tonight!". ☆63 Updated 9 years ago