cbaziotis / ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and a Twitter corpus of 330 million English tweets).
★673 · Updated 6 months ago
Alternatives and similar repositories for ekphrasis
Users who are interested in ekphrasis are comparing it to the libraries listed below.
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ★743 · Updated last year
- semi supervised guided topic model with custom guidedLDA ★512 · Updated 8 months ago
- Python Implementations of Word Sense Disambiguation (WSD) Technologies. ★748 · Updated 3 years ago
- A Survey and Experiments on Annotated Corpora for Emotion Classification in Text ★234 · Updated 2 years ago
- BERTweet: A pre-trained language model for English Tweets (EMNLP-2020) ★601 · Updated last year
- PyTorch deep learning models for document classification ★596 · Updated 2 years ago
- EmbedRank: Unsupervised Keyphrase Extraction using Sentence Embeddings (official implementation) ★438 · Updated 2 years ago
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ★528 · Updated last year
- Elegant and Easy Tweet Preprocessing in Python ★310 · Updated 2 years ago
- Catalog of abusive language data (PLoS 2020) ★321 · Updated last year
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ★1,254 · Updated 5 months ago
- A framework to learn cross-lingual word embedding mappings ★652 · Updated 2 years ago
- Repository for TweetEval ★390 · Updated 3 years ago
- Text Similarity ★398 · Updated 5 years ago
- The SentiWordNet sentiment lexicon ★333 · Updated 3 years ago
- A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of lang… ★1,555 · Updated 6 months ago
- ★234 · Updated 8 years ago
- General purpose unsupervised sentence representations ★1,207 · Updated 3 years ago
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary. ★319 · Updated 2 weeks ago
- A comparison and discussion of different NLP methods for 5-class sentiment classification on the SST-5 dataset. ★172 · Updated 8 months ago
- This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importance of modeling structure, context, an… ★561 · Updated 3 years ago
- Sentence paraphrase generation at the sentence level ★408 · Updated 3 years ago
- Calculates Word Mover's Distance Insanely Fast ★462 · Updated 2 years ago
- Repository with all what is necessary for sentiment analysis and related areas ★542 · Updated 2 years ago
- Datasets to train supervised classifiers for Named-Entity Recognition in different languages (Portuguese, German, Dutch, French, English) ★346 · Updated 3 years ago
- 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy ★1,402 · Updated last month
- A python tool for evaluating the quality of sentence embeddings. ★2,108 · Updated last year
- Compute Sentence Embeddings Fast! ★624 · Updated 2 years ago
- Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python ★272 · Updated 2 years ago
- End-to-end Neural Coreference Resolution ★526 · Updated 3 years ago