cbaziotis/ekphrasis

Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora: English Wikipedia and Twitter (330 million English tweets).
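
As a rough illustration of how word statistics can drive hashtag segmentation, here is a minimal sketch in Python. It is not ekphrasis's implementation or API: the `TOY_COUNTS` table, the unknown-word penalty, and the `segment` function are all hypothetical stand-ins, and a real system would load counts from a large corpus instead.

```python
import math
from functools import lru_cache

# Hypothetical toy unigram counts; a real segmenter would use corpus statistics.
TOY_COUNTS = {
    "small": 500, "and": 5000, "in": 4000, "significant": 200,
    "insignificant": 150, "a": 6000, "sign": 300,
}
TOTAL = sum(TOY_COUNTS.values())
MAX_WORD_LEN = max(len(w) for w in TOY_COUNTS)


def word_logprob(word):
    """Log-probability of a word under the toy unigram model."""
    count = TOY_COUNTS.get(word)
    if count is not None:
        return math.log(count / TOTAL)
    # Assumed penalty for unknown words: shrinks sharply with word length.
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


def segment(text):
    """Split text into the word sequence with the highest total log-probability."""

    @lru_cache(maxsize=None)
    def best(i):
        # Best (score, words) for the suffix text[i:].
        if i == len(text):
            return 0.0, ()
        candidates = []
        for j in range(i + 1, min(len(text), i + MAX_WORD_LEN) + 1):
            rest_score, rest_words = best(j)
            word = text[i:j]
            candidates.append((word_logprob(word) + rest_score, (word,) + rest_words))
        return max(candidates)

    return list(best(0)[1])


print(segment("smallandinsignificant"))  # ['small', 'and', 'insignificant']
```

With these toy counts, the single word "insignificant" scores higher than the split "in" + "significant", which is why the sketch keeps it whole.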

☆675

Alternatives and similar repositories for ekphrasis

Users that are interested in ekphrasis are comparing it to the libraries listed below.

cbaziotis / datastories-semeval2017-task4
Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentimen…
☆199 · Jun 8, 2018 · Updated 8 years ago
cbaziotis / ntua-slp-semeval2018
Deep-learning models of NTUA-SLP team submitted in SemEval 2018 tasks 1, 2 and 3.
☆85 · Jun 21, 2022 · Updated 4 years ago
georgepar / grnet_guide
Guide for the slp group on how to use the Grnet cluster
☆11 · Apr 16, 2020 · Updated 6 years ago
cbaziotis / twitter-stream-downloader
A service for downloading twitter streaming data. You can save the data either in text files on disk, or in a database (MongoDB).
☆23 · Dec 1, 2018 · Updated 7 years ago
s / preprocessor
Elegant and Easy Tweet Preprocessing in Python
☆310 · Apr 17, 2023 · Updated 3 years ago
alexandra-chron / hierarchical-domain-adaptation
Code of NAACL 2022 "Efficient Hierarchical Domain Adaptation for Pretrained Language Models" paper.
☆32 · Sep 26, 2023 · Updated 2 years ago
alexandra-chron / siatl
PyTorch source code of NAACL 2019 paper "An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models"
☆96 · Nov 2, 2023 · Updated 2 years ago
alexandra-chron / ntua-slp-wassa-iest2018
Deep-learning Transfer Learning models of NTUA-SLP team submitted at the IEST of WASSA 2018 at EMNLP 2018.
☆32 · Dec 27, 2022 · Updated 3 years ago
precog-iiith / hindi-english-code-mixed-POS-tagging
POS tagging models for Hindi English Code Mixed Tweets
☆11 · Aug 1, 2018 · Updated 7 years ago
flairNLP / flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
☆14,381 · Oct 27, 2025 · Updated 9 months ago
georgepar / optimistic-adam
PyTorch implementation of Optimistic Adam proposed in Training GANs with Optimism (https://arxiv.org/pdf/1711.00141.pdf)
☆20 · Jan 16, 2021 · Updated 5 years ago
jwieting / paraphrastic-representations-at-scale
☆74 · Jul 2, 2021 · Updated 5 years ago
chartbeat-labs / textacy
NLP, before and after spaCy
☆2,239 · Sep 22, 2023 · Updated 2 years ago
facebookresearch / SentEval
A python tool for evaluating the quality of sentence embeddings.
☆2,110 · Mar 19, 2024 · Updated 2 years ago
FredericGodin / TwitterEmbeddings
Twitter word embeddings generated using Word2Vec and FastText.
☆47 · Aug 17, 2019 · Updated 6 years ago
makcedward / nlpaug
Data augmentation for NLP
☆4,663 · Updated this week
MilaNLProc / contextualized-topic-models
A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher…
☆1,271 · Jul 24, 2025 · Updated last year
kootenpv / contractions
Fixes contractions such as `you're` to `you are`
☆318 · Nov 15, 2022 · Updated 3 years ago
marcotcr / checklist
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
☆2,052 · Jan 9, 2024 · Updated 2 years ago
VinAIResearch / BERTweet
BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)
☆609 · Jul 22, 2024 · Updated 2 years ago
erikavaris / tokenizer
Tokenizer for Twitter and Reddit data
☆45 · Apr 14, 2019 · Updated 7 years ago
cbaziotis / neat-vision
Neat (Neural Attention) Vision, is a visualization tool for the attention mechanisms of deep-learning models for Natural Language Process…
☆251 · May 4, 2018 · Updated 8 years ago
sebastianruder / NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the mo…
☆22,955 · Jul 28, 2024 · Updated 2 years ago
allenai / allennlp
An open-source NLP research library, built on PyTorch.
☆11,889 · Nov 22, 2022 · Updated 3 years ago
chakki-works / seqeval
A Python framework for sequence labeling evaluation(named-entity recognition, pos tagging, etc...)
☆1,185 · Aug 28, 2024 · Updated last year
boudinfl / pke
Python Keyphrase Extraction module
☆1,589 · Jul 12, 2023 · Updated 3 years ago
plasticityai / magnitude
A fast, efficient universal vector embedding utility package.
☆1,665 · Aug 3, 2023 · Updated 2 years ago
alexandra-chron / lexical_xlm_relm
PyTorch source code of NAACL 2021 paper "Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Tran…
☆18 · Oct 18, 2022 · Updated 3 years ago
napsternxg / TwitterNER
Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html
☆140 · Aug 15, 2022 · Updated 3 years ago
NIHOPA / NLPre
Python library for Natural Language Preprocessing (NLPre)
☆190 · Jul 31, 2023 · Updated 2 years ago
ENCASEH2020 / hatespeech-twitter
☆55 · Mar 24, 2022 · Updated 4 years ago
artetxem / vecmap
A framework to learn cross-lingual word embedding mappings
☆656 · Apr 22, 2023 · Updated 3 years ago
facebookresearch / SentAugment
SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c…
☆359 · Feb 22, 2022 · Updated 4 years ago
HLasse / TextDescriptives
A Python library for calculating a large variety of metrics from text
☆366 · May 5, 2026 · Updated 2 months ago
huggingface / adversarialnlp
A generic library for crafting adversarial NLP examples - WIP
☆42 · Oct 26, 2018 · Updated 7 years ago
neubig / lowresource-nlp-bootcamp-2020
The website for the CMU Language Technologies Institute low resource NLP bootcamp 2020
☆607 · Jun 4, 2020 · Updated 6 years ago
jfilter / clean-text
🧹 Python package for text cleaning
☆1,026 · May 15, 2026 · Updated 2 months ago
explosion / sense2vec
🦆 Contextually-keyed word vectors
☆1,678 · Mar 27, 2026 · Updated 4 months ago
facebookresearch / InferSent
InferSent sentence embeddings
☆2,280 · Aug 30, 2021 · Updated 4 years ago