cbaziotis / ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora: English Wikipedia and a Twitter corpus of 330 million English tweets.
★672, updated 5 months ago
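The description above mentions word segmentation for splitting hashtags based on word statistics. A minimal sketch of that general idea follows. It is not the ekphrasis implementation or its API: the `COUNTS` table is a tiny hypothetical stand-in for real corpus statistics, and `word_logprob`, `segment`, and `segment_hashtag` are illustrative names. It picks the split that maximizes the sum of unigram log-probabilities.

```python
import math
from functools import lru_cache

# Hypothetical toy unigram counts; a real tool would load counts from large corpora.
COUNTS = {"the": 500, "best": 120, "day": 150, "ever": 90,
          "be": 200, "st": 5, "a": 400, "eve": 10}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = 20  # assumed upper bound on the length of a single word


def word_logprob(word):
    """Log-probability of a word; unseen words are penalized by their length."""
    if word in COUNTS:
        return math.log(COUNTS[word] / TOTAL)
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


def segment(text):
    """Split text into the word sequence with the highest total log-probability."""

    @lru_cache(maxsize=None)
    def best(start):
        # Best (score, words) for the suffix text[start:].
        if start == len(text):
            return 0.0, ()
        candidates = []
        for end in range(start + 1, min(len(text), start + MAX_WORD_LEN) + 1):
            word = text[start:end]
            rest_score, rest_words = best(end)
            candidates.append((word_logprob(word) + rest_score, (word,) + rest_words))
        return max(candidates)

    return list(best(0)[1])


def segment_hashtag(tag):
    """Strip the leading '#', lowercase, and segment the remainder."""
    return segment(tag.lstrip("#").lower())


print(segment_hashtag("#BestDayEver"))
```

With the toy counts above, the known words "best", "day", and "ever" outscore any split that relies on unseen fragments. Results on real hashtags depend entirely on the quality of the underlying word statistics.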
Alternatives and similar repositories for ekphrasis
Users who are interested in ekphrasis are comparing it to the libraries listed below.
- semi supervised guided topic model with custom guidedLDA ★511, updated 6 months ago
- Use the latest Stanza (StanfordNLP) research models directly in spaCy ★738, updated last year
- BERTweet: A pre-trained language model for English Tweets (EMNLP-2020) ★599, updated last year
- EmbedRank: Unsupervised Keyphrase Extraction using Sentence Embeddings (official implementation) ★438, updated 2 years ago
- Repository for TweetEval ★386, updated 3 years ago
- Elegant and Easy Tweet Preprocessing in Python ★309, updated 2 years ago
- Python Keyphrase Extraction module ★1,584, updated 2 years ago
- GSDMM: Short text clustering ★357, updated 2 years ago
- Text Similarity ★398, updated 5 years ago
- Compute Sentence Embeddings Fast! ★623, updated 2 years ago
- TextRank implementation for Python 3. ★1,267, updated 2 years ago
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ★527, updated last year
- PyTorch deep learning models for document classification ★595, updated 2 years ago
- A sentence segmenter that actually works! ★304, updated 5 years ago
- A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of lang… ★1,553, updated 5 months ago
- Repository with everything necessary for sentiment analysis and related areas ★541, updated 2 years ago
- LexRank algorithm for text summarization ★231, updated last year
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ★1,250, updated 3 months ago
- General purpose unsupervised sentence representations ★1,204, updated 3 years ago
- The SentiWordNet sentiment lexicon ★332, updated 3 years ago
- Datasets to train supervised classifiers for Named-Entity Recognition in different languages (Portuguese, German, Dutch, French, English) ★346, updated 3 years ago
- Data repository for pretrained NLP models and NLP corpora. ★1,039, updated 7 years ago
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ★365, updated 2 years ago
- ★235, updated 8 years ago
- Steam review texting embedding analysis ★143, updated 2 years ago
- Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python ★272, updated 2 years ago
- Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy ★1,401, updated this week
- A Survey and Experiments on Annotated Corpora for Emotion Classification in Text ★235, updated 2 years ago
- A curated list of resources dedicated to text summarization ★1,542, updated 2 years ago
- Topic Modeling in Embedding Spaces ★560, updated 2 years ago