OnlpLab / NEMO-Corpus
Named Entity (NER) annotations of the Hebrew Treebank (Haaretz newspaper) corpus, including: morpheme and token level NER labels, nested mentions, and more.
☆9 Updated 2 years ago
Related projects
Alternatives and complementary repositories for NEMO-Corpus
- ☆17 Updated 3 months ago
- An NLP pipeline for Hebrew ☆34 Updated 7 months ago
- Repository for rstWeb, a browser-based annotation interface for Rhetorical Structure Theory ☆41 Updated 3 weeks ago
- Poetry Corpora Annotated on Aesthetic Emotions ☆11 Updated 2 years ago
- Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at… ☆22 Updated 3 months ago
- Neural Modeling for Named Entities and Morphology (Hebrew NER) ☆30 Updated last year
- List of corpora annotated for coreference for different languages ☆17 Updated 3 months ago
- A repository to keep tools, scripts, data for SMART task. ☆11 Updated 2 years ago
- GC4LM: A Colossal (Biased) language model for German ☆13 Updated 3 years ago
- An easy-to-use API for analyzing INCEpTION annotation projects. ☆16 Updated last year
- An implementation of GrASP (Shnarch et al., 2017) ☆21 Updated 2 years ago
- Several algorithms for converting dependency structures into constituency structures. ☆9 Updated 2 years ago
- A python module for evaluating NERC and NEL system performances as defined in the HIPE shared tasks (formerly CLEF-HIPE-2020-scorer). ☆13 Updated 5 months ago
- Dutch coreference resolution & dialogue analysis using deterministic rules ☆21 Updated last year
- A survey of corpora for Germanic low-resource languages and dialects ☆24 Updated 3 months ago
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst… ☆21 Updated 2 years ago
- An embeddable annotation tool for end-to-end cross-document coreference ☆41 Updated last year
- English web corpus with 4M tokens and several annotation types ☆25 Updated last year
- BERT and ELECTRA models trained on Europeana Newspapers ☆36 Updated 2 years ago
- ☆33 Updated last year
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions ☆25 Updated 4 years ago
- Efficient-Sentence-Embedding-using-Discrete-Cosine-Transform ☆17 Updated 4 years ago
- A field-tested Hebrew tokenizer for dirty texts (ben-yehuda project, bible, cc100, mc4, opensubs, oscar, twitter) focused on multi-word e… ☆21 Updated 2 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆76 Updated 4 months ago
- BabelNet (and WordNet) sense embedding trained with Word2Vec and FastText ☆10 Updated 5 years ago
- Data for the HIPE 2022 shared task. ☆16 Updated 11 months ago
- A small repository to test Captum Explainable AI with a trained Flair transformers-based text classifier. ☆26 Updated 3 years ago
- [COLING2020] A challenge dataset for Person SenTiment analysis in news domain. ☆10 Updated 2 years ago
- CrossRE: A Cross-Domain Dataset for Relation Extraction (Findings of EMNLP 2022) ☆47 Updated 3 months ago
- The Universal Decompositional Semantics (UDS) dataset and the Decomp toolkit ☆56 Updated last year