Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora: English Wikipedia and a Twitter corpus of 330 million English tweets.
☆676 · Jun 2, 2025 · Updated 9 months ago
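To illustrate what statistics-based word segmentation for hashtags means, here is a minimal standard-library sketch. It is not ekphrasis's code or API: the counts in `COUNTS` are made-up toy numbers rather than the Wikipedia or Twitter statistics, and the names `word_logprob`, `segment`, and `split_hashtag` are hypothetical. It assumes a plain unigram model and picks the split whose words have the highest combined probability.

```python
import math
from functools import lru_cache

# Toy unigram counts (hypothetical numbers, NOT real corpus statistics).
COUNTS = {"best": 50, "day": 80, "ever": 40, "be": 120, "a": 300, "eve": 5, "st": 2}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = 20  # longest candidate word considered at each position


def word_logprob(word):
    """Log-probability of a word; unseen words are penalized more the longer they are."""
    if word in COUNTS:
        return math.log(COUNTS[word] / TOTAL)
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


@lru_cache(maxsize=None)
def segment(text):
    """Return (score, words) for the most probable split of text under the unigram model."""
    if not text:
        return 0.0, ()
    candidates = []
    for i in range(1, min(len(text), MAX_WORD_LEN) + 1):
        head, tail = text[:i], text[i:]
        tail_score, tail_words = segment(tail)
        candidates.append((word_logprob(head) + tail_score, (head,) + tail_words))
    return max(candidates)


def split_hashtag(tag):
    """Strip the leading '#', lowercase, and split the remainder into words."""
    return list(segment(tag.lstrip("#").lower())[1])
```

With these toy counts, `split_hashtag("#BestDayEver")` returns `['best', 'day', 'ever']`. Real corpus statistics matter because the quality of the split depends entirely on how well the counts reflect the text being segmented.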
Alternatives and similar repositories for ekphrasis
Users who are interested in ekphrasis are comparing it to the libraries listed below.
- Deep-learning models of NTUA-SLP team submitted in SemEval 2018 tasks 1, 2 and 3. ☆85 · Jun 21, 2022 · Updated 3 years ago
- Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentimen… ☆199 · Jun 8, 2018 · Updated 7 years ago
- A service for downloading twitter streaming data. You can save the data either in text files on disk, or in a database (MongoDB). ☆23 · Dec 1, 2018 · Updated 7 years ago
- POS tagging models for Hindi English Code Mixed Tweets ☆11 · Aug 1, 2018 · Updated 7 years ago
- Elegant and Easy Tweet Preprocessing in Python ☆309 · Apr 17, 2023 · Updated 2 years ago
- Data augmentation for NLP ☆4,645 · Jun 24, 2024 · Updated last year
- A python tool for evaluating the quality of sentence embeddings. ☆2,106 · Mar 19, 2024 · Updated last year
- PyTorch source code of NAACL 2019 paper "An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models" ☆96 · Nov 2, 2023 · Updated 2 years ago
- A generic library for crafting adversarial NLP examples - WIP ☆41 · Oct 26, 2018 · Updated 7 years ago
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ☆1,265 · Jul 24, 2025 · Updated 7 months ago
- NLP, before and after spaCy ☆2,235 · Sep 22, 2023 · Updated 2 years ago
- A web application tagging and retrieval of arguments in text ☆30 · May 1, 2023 · Updated 2 years ago
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ☆14,354 · Oct 27, 2025 · Updated 4 months ago
- Tokenizer for Twitter and Reddit data ☆45 · Apr 14, 2019 · Updated 6 years ago
- Beyond Accuracy: Behavioral Testing of NLP models with CheckList ☆2,050 · Jan 9, 2024 · Updated 2 years ago
- Fixes contractions such as `you're` to `you are` ☆319 · Nov 15, 2022 · Updated 3 years ago
- ☆75 · Jul 2, 2021 · Updated 4 years ago
- A Python library for calculating a large variety of metrics from text ☆360 · Jan 30, 2026 · Updated last month
- Guide for the slp group on how to use the Grnet cluster ☆11 · Apr 16, 2020 · Updated 5 years ago
- Python Keyphrase Extraction module ☆1,588 · Jul 12, 2023 · Updated 2 years ago
- An open-source NLP research library, built on PyTorch. ☆11,889 · Nov 22, 2022 · Updated 3 years ago
- ☆55 · Mar 24, 2022 · Updated 3 years ago
- Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the mo… ☆22,981 · Jul 28, 2024 · Updated last year
- A Python framework for sequence labeling evaluation (named-entity recognition, pos tagging, etc...) ☆1,175 · Aug 28, 2024 · Updated last year
- 🧹 Python package for text cleaning ☆1,002 · Jan 28, 2026 · Updated last month
- 🦆 Contextually-keyed word vectors ☆1,673 · Apr 23, 2025 · Updated 10 months ago
- Python library for Natural Language Preprocessing (NLPre) ☆192 · Jul 31, 2023 · Updated 2 years ago
- A Survey and Experiments on Annotated Corpora for Emotion Classification in Text ☆234 · Apr 26, 2023 · Updated 2 years ago
- Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html ☆140 · Aug 15, 2022 · Updated 3 years ago
- 🌊 HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural network model for several NLP tasks based on PyTorch and AllenNLP ☆1,195 · Aug 1, 2023 · Updated 2 years ago
- Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering. ☆1,752 · Dec 20, 2023 · Updated 2 years ago
- InferSent sentence embeddings ☆2,280 · Aug 30, 2021 · Updated 4 years ago
- Deep-learning Transfer Learning models of NTUA-SLP team submitted at the IEST of WASSA 2018 at EMNLP 2018. ☆32 · Dec 27, 2022 · Updated 3 years ago
- The website for the CMU Language Technologies Institute low resource NLP bootcamp 2020 ☆606 · Jun 4, 2020 · Updated 5 years ago
- A fast, efficient universal vector embedding utility package. ☆1,655 · Aug 3, 2023 · Updated 2 years ago
- Scikit-learn style model finetuning for NLP ☆720 · Oct 21, 2025 · Updated 4 months ago
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c… ☆359 · Feb 22, 2022 · Updated 4 years ago
- Library for Knowledge Intensive Language Tasks ☆967 · Mar 31, 2022 · Updated 3 years ago
- The tool to make NLP datasets ready to use ☆241 · Oct 20, 2022 · Updated 3 years ago