kohjiaxuan / NLP-Model-for-Corpus-Similarity
An NLP algorithm I developed to determine the similarity or relation between two documents/Wikipedia articles. Inspired by the cosine similarity algorithm and built from WordNet.
☆9 · Updated 4 years ago
Alternatives and similar repositories for NLP-Model-for-Corpus-Similarity:
Users who are interested in NLP-Model-for-Corpus-Similarity are comparing it to the libraries listed below.
- Text processing library for sentiment analysis and related tasks ☆27 · Updated 6 years ago
- Natural language processing (NLP) utils: word embeddings (Word2Vec, GloVe, FastText, ...) and preprocessing transformers, compatible wi… ☆62 · Updated last year
- Fine-tune transformers with pytorch-lightning ☆44 · Updated 2 years ago
- fastText Quick Start Guide, published by Packt ☆49 · Updated last year
- Rank-based Unsupervised Keyword Extraction via Metavertex Aggregation ☆99 · Updated 2 months ago
- Differentiable Readability Measure Regularizer for Neural Network Automatic Text Simplification ☆24 · Updated last year
- General-Purpose Neural Networks for Sentence Boundary Detection ☆73 · Updated last year
- Neural Network for Automatic Negation Detection ☆20 · Updated 8 years ago
- ☆15 · Updated 5 years ago
- Detect common phrases in large amounts of text using a data-driven approach. Size of discovered phrases can be arbitrary. Can be used in … ☆125 · Updated 5 years ago
- ✨ Web interface for NeuralCoref coreference resolution ☆34 · Updated last year
- Tutorial for Topic Modelling using PySpark and Spark NLP ☆16 · Updated 4 years ago
- A simple neural truecaser written in pytorch and allennlp. ☆32 · Updated 7 months ago
- Regex-like tree pattern matching, applied to a sentence's tree instead of strings ☆42 · Updated 6 years ago
- Calculate average word embeddings (word2vec) from documents for transfer learning ☆54 · Updated 8 months ago
- Dict2vec is a framework to learn word embeddings using lexical dictionaries. ☆114 · Updated 4 years ago
- WordMoversEmbeddings (WME) is simple code for generating the vector representation of a sentence/document for text classification and clus… ☆81 · Updated 6 years ago
- Word Sense Induction with BERT MLM ☆28 · Updated last year
- A collection of English tweets annotated in Universal Dependencies. ☆39 · Updated 3 years ago
- A python 3 interface for BabelNet https://babelnet.org/ ☆31 · Updated last year
- Automatic labeling for topic model ☆57 · Updated 9 years ago
- Code and model files for paper: I. Lourentzou et al., "Adapting Sequence to Sequence models for Text Normalization in Social Media", ICWSM… ☆36 · Updated 3 years ago
- N-gram Extraction Approaches (bigrams, trigrams) ☆42 · Updated 6 years ago
- Multilingual character-based named entity recognizer ☆25 · Updated 6 years ago
- NER, syntax markup visualizations ☆136 · Updated last year
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆78 · Updated 6 months ago
- A Large Automatically-Constructed Resource of Predicate Paraphrases ☆43 · Updated 4 years ago
- Tool for parsing and converting various span encoding schemes. ☆22 · Updated last year
- spaCy + UDPipe ☆161 · Updated 2 years ago