swarajban / multithreadedWordCounting
Word count of a large file using a prefix tree and parallel Python processes
☆18Updated 11 years ago
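The description above names two ideas: a prefix tree (trie) that stores word counts, and parallel Python processes. The following is a minimal sketch of how they could fit together. It is not the repository's actual code. The nested-dict trie layout, the `COUNT` sentinel, and all function names are assumptions made for illustration.

```python
from multiprocessing import Pool

# Hypothetical sentinel key: holds the count of the word ending at a node.
# None cannot collide with a character key, because characters are always str.
COUNT = None


def build_trie(lines):
    """Count the words in an iterable of lines into a nested-dict prefix tree."""
    root = {}
    for line in lines:
        for word in line.lower().split():
            node = root
            for ch in word:
                node = node.setdefault(ch, {})
            node[COUNT] = node.get(COUNT, 0) + 1
    return root


def merge_tries(dst, src):
    """Add every count stored in src into dst (in place) and return dst."""
    for key, value in src.items():
        if key is COUNT:
            dst[COUNT] = dst.get(COUNT, 0) + value
        else:
            merge_tries(dst.setdefault(key, {}), value)
    return dst


def lookup(trie, word):
    """Return how often word was counted, or 0 if it never occurred."""
    node = trie
    for ch in word:
        node = node.get(ch)
        if node is None:
            return 0
    return node.get(COUNT, 0)


def count_words_parallel(lines, workers=2):
    """Split the lines across worker processes and merge the partial tries."""
    chunks = [lines[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        partial_tries = pool.map(build_trie, chunks)
    total = {}
    for trie in partial_tries:
        merge_tries(total, trie)
    return total


if __name__ == "__main__":
    sample = ["the cat sat", "the cat ran", "a dog sat"]
    trie = count_words_parallel(sample)
    print(lookup(trie, "the"), lookup(trie, "cat"), lookup(trie, "bird"))
```

Each worker builds its own trie so that no state is shared between processes, and the parent merges the partial results. For a genuinely large file, the lines would be read in chunks rather than held in one list.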
Alternatives and similar repositories for multithreadedWordCounting:
Users who are interested in multithreadedWordCounting are comparing it to the libraries listed below.
- An easy-install script for LibShortText☆27Updated 10 years ago
- iCQA - Intelligent Community Question Answering Framework☆31Updated 8 years ago
- Non-Overlapping Aho-Corasick Python extension, for Python 2 (str and unicode) and Python 3☆51Updated 9 years ago
- Python toolkit for ranking experiments on sentence/summary data☆24Updated last year
- Named Entity Recognition demo with the NLTK☆13Updated 13 years ago
- Distributed text analysis suite based on Celery☆95Updated 2 years ago
- Tools for Chinese word segmentation and POS tagging written in Python☆38Updated 11 years ago
- Nonparametric timeseries classification for Twitter trending topic detection (MEng thesis)☆119Updated 11 years ago
- Keyword query search engine on semantic store/linked data web☆9Updated 9 years ago
- Yet another Chinese word segmentation package based on character-based tagging heuristics and CRF algorithm☆243Updated 12 years ago
- Auto-generate Chinese words in huge text.☆91Updated 10 years ago
- tyccl (同义词词林) is a Ruby gem that provides friendly functions to analyse similarity between Chinese words.☆46Updated 11 years ago
- *Deprecated* A fast and accurate part-of-speech tagger for TextBlob.☆102Updated 9 years ago
- Links parts of input text to Wikipedia articles☆16Updated 12 years ago
- Includes Code for Inference and Evaluation of Topic Models for Selectional Preferences☆16Updated last year
- An Apache Lucene TokenFilter that uses word2vec vectors for term expansion.☆24Updated 10 years ago
- A program to correct non-word spelling errors in sentences using ngram MAP Language Models, Noisy Channel Model, Error Confusion Matrix an…☆53Updated 4 years ago
- Show summary of a large number of URLs in a Jupyter Notebook☆17Updated 3 years ago
- A small test for recognizing persons with a word2vec model in German☆13Updated 9 years ago
- A Python implementation of Probabilistic Context-Free Grammar Parser.☆69Updated 11 years ago
- Code for the ACL-2015 paper "Accurate Linear-Time Chinese Word Segmentation via Embedding Matching"☆38Updated 9 years ago
- Target-dependent Twitter Sentiment Classification with Rich Automatic Features☆22Updated 8 years ago
- An extension of word2vec to efficiently represent new text as vectors. New text can be query, sentence and paragraph.☆67Updated 7 years ago
- Awesome deep learning based NLP papers and survey, also some awesome machine learning/vision material☆21Updated 8 years ago
- A tool to segment text based on frequencies and the Viterbi algorithm "#TheBoyWhoLived" => ['#', 'The', 'Boy', 'Who', 'Lived']☆82Updated 8 years ago
- SALM: Suffix Array and its Applications in Empirical Language Processing by Joy☆11Updated 7 years ago
- Implicit relation extractor using a natural language model.☆25Updated 6 years ago
- Awesome-Text-Classification Projects, Papers, Tutorial.☆170Updated 7 years ago