makrai / notes
Notes on papers in Natural Language Processing, Computational Linguistics, and the related sciences
☆14 Updated last week
Alternatives and similar repositories for notes
Users that are interested in notes are comparing it to the libraries listed below
- Featurize words into orthographic and phonological vectors. ☆41 Updated 2 years ago
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/ ☆49 Updated 2 years ago
- ☆22 Updated 2 years ago
- ☆15 Updated 7 years ago
- Training Temporal Word Embeddings with a Compass ☆65 Updated 5 months ago
- Röttger et al. (WOAH at NAACL 2022): "Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models" ☆17 Updated 3 years ago
- Dataset and code of our EMNLP 2019 paper "Multilingual and Multi-Aspect Hate Speech Analysis" ☆58 Updated last year
- A multilingual lexicon of words to hurt. ☆94 Updated 3 months ago
- An unsupervised method for target-specific stance detection using embeddings-based clustering and projection techniques. Achieves 90% pre… ☆24 Updated 10 months ago
- ☆55 Updated 3 years ago
- Code for the paper "Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora", ACL 2020. ☆18 Updated 5 years ago
- ☆54 Updated 4 years ago
- Unsupervised method for extracting quotation-speaker pairs from large news corpora. ☆29 Updated 7 years ago
- ☆64 Updated 3 years ago
- ☆35 Updated last year
- Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure" ☆29 Updated 5 years ago
- XED multilingual emotion datasets ☆64 Updated 2 years ago
- spaCy + UDPipe ☆166 Updated 3 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 Updated 3 years ago
- Code for FACTOID dataset paper in LREC 2022 ☆18 Updated 3 years ago
- ☆18 Updated 4 years ago
- A data set regarding news veracity on social media. Published at ICWSM-18. ☆36 Updated 4 years ago
- A Super-Lightweight Annotation Tool for Experts: Label text in a terminal with just Python ☆112 Updated last month
- ☆11 Updated 6 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆81 Updated last year
- REMERGE - Multi-Word Expression discovery algorithm ☆14 Updated 3 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆35 Updated this week
- This repository contains papers and resources pertaining to Hate speech research. ☆44 Updated 4 years ago
- Tokenizer for Twitter and Reddit data ☆45 Updated 6 years ago
- Learning BPE embeddings by first learning a segmentation model and then training word2vec ☆19 Updated 3 years ago