cbaziotis / ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, 330 million English tweets).
☆664 Updated 11 months ago
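The word segmentation step mentioned above (splitting a hashtag into words using corpus word statistics) can be illustrated with a minimal sketch. This is not ekphrasis's own code or API: the `COUNTS` table, the unknown-word penalty, and the function names `word_logprob` and `segment` are all hypothetical, and the toy counts stand in for the Wikipedia and Twitter statistics the description refers to. It shows one common approach, a dynamic-programming search for the split with the highest unigram probability.

```python
import math
from functools import lru_cache

# Hypothetical toy unigram counts; a real tool would load counts from a large corpus.
COUNTS = {"the": 500, "best": 80, "day": 120, "ever": 60, "be": 200, "a": 400, "eve": 10}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = 20  # assumed upper bound on the length of a single word


def word_logprob(word):
    """Log-probability of a word under the toy unigram model."""
    count = COUNTS.get(word)
    if count is None:
        # Assumed penalty: unknown words get rapidly less likely as they get longer.
        return -math.log(TOTAL * 10 ** len(word))
    return math.log(count / TOTAL)


def segment(text):
    """Split a hashtag or run-together string into its most probable words."""
    text = text.lstrip("#").lower()

    @lru_cache(maxsize=None)
    def best(start):
        # Returns (score, words) for the best segmentation of text[start:].
        if start == len(text):
            return 0.0, ()
        candidates = []
        for end in range(start + 1, min(len(text), start + MAX_WORD_LEN) + 1):
            word = text[start:end]
            tail_score, tail_words = best(end)
            candidates.append((word_logprob(word) + tail_score, (word,) + tail_words))
        return max(candidates)

    return list(best(0)[1])


if __name__ == "__main__":
    print(segment("#BestDayEver"))
```

With the toy counts, known words win over unknown fragments, so the quality of the split depends entirely on the word statistics supplied. That is presumably why the tool draws on large corpora.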
Alternatives and similar repositories for ekphrasis:
Users who are interested in ekphrasis are comparing it to the libraries listed below:
- Calculates Word Mover's Distance Insanely Fast ☆460 Updated last year
- Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) ☆1,194 Updated 4 months ago
- EmbedRank: Unsupervised Keyphrase Extraction using Sentence Embeddings (official implementation) ☆434 Updated last year
- A framework to learn cross-lingual word embedding mappings ☆648 Updated last year
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ☆730 Updated 6 months ago
- BERTweet: A pre-trained language model for English Tweets (EMNLP-2020) ☆584 Updated 6 months ago
- semi supervised guided topic model with custom guidedLDA ☆502 Updated 4 years ago
- General purpose unsupervised sentence representations ☆1,200 Updated 2 years ago
- Tensorflow implementation of contextualized word representations from bi-directional language models ☆1,618 Updated last year
- PyTorch deep learning models for document classification ☆593 Updated last year
- Compute Sentence Embeddings Fast! ☆618 Updated last year
- Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentimen… ☆197 Updated 6 years ago
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ☆514 Updated 3 months ago
- Evaluating Cross-lingual Sentence Representations ☆449 Updated 3 years ago
- Datasets to train supervised classifiers for Named-Entity Recognition in different languages (Portuguese, German, Dutch, French, English) ☆340 Updated 2 years ago
- Pre-trained ELMo Representations for Many Languages ☆1,462 Updated 3 years ago
- Data repository for pretrained NLP models and NLP corpora. ☆1,001 Updated 6 years ago
- LexRank algorithm for text summarization ☆230 Updated 10 months ago
- Package for evaluating word embeddings ☆436 Updated 4 years ago
- Super easy library for BERT based NLP models ☆1,881 Updated 6 months ago
- Python Implementations of Word Sense Disambiguation (WSD) Technologies. ☆746 Updated 2 years ago
- Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence (NAACL 2019) ☆506 Updated 3 years ago
- Python Keyphrase Extraction module ☆1,577 Updated last year
- Elegant and Easy Tweet Preprocessing in Python ☆305 Updated last year
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary. ☆312 Updated last month
- Fixes contractions such as `you're` to `you are` ☆315 Updated 2 years ago
- Retrofitting Word Vectors to Semantic Lexicons ☆375 Updated 5 years ago
- InferSent sentence embeddings ☆2,285 Updated 3 years ago
- 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy ☆1,362 Updated 2 weeks ago
- TextRank implementation for Python 3. ☆1,249 Updated last year