BramVanroy / spacy_download
Download and load spaCy models on-the-fly
☆15 Updated 2 years ago
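The description above suggests a load-or-download pattern. The following is a minimal sketch of that idea, not this package's own API, which the listing does not show. It assumes spaCy's `spacy.load` and `spacy.cli.download`; the helper name `load_or_download` and its `load`/`download` parameters are hypothetical and exist only so that stand-in callables can be swapped in.

```python
from typing import Any, Callable, Optional


def load_or_download(
    name: str,
    load: Optional[Callable[[str], Any]] = None,
    download: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Load a pipeline by name, downloading it first if it is not installed.

    `load` and `download` are hypothetical injection points; when omitted,
    spaCy's own functions are used.
    """
    if load is None or download is None:
        # Imported lazily so the helper also works with stand-in callables.
        import spacy
        from spacy.cli import download as spacy_cli_download

        load = load or spacy.load
        download = download or spacy_cli_download

    try:
        return load(name)
    except OSError:
        # spaCy raises OSError when the requested package cannot be found.
        download(name)
        return load(name)
```

With spaCy installed, `load_or_download("en_core_web_sm")` would try to fetch the package on first use, which needs network access. A freshly installed package may not always be importable within the same process, which is one reason a dedicated helper library could be preferable to this sketch.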
Alternatives and similar repositories for spacy_download:
Users who are interested in spacy_download are comparing it to the libraries listed below.
- BERT models for many languages created from Wikipedia texts ☆33 Updated 4 years ago
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst… ☆22 Updated 3 years ago
- REMERGE - Multi-Word Expression discovery algorithm ☆14 Updated 2 years ago
- A simple neural truecaser written in PyTorch and AllenNLP. ☆33 Updated 9 months ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆86 Updated 2 months ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 3 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆30 Updated last month
- ☆17 Updated last year
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 Updated 2 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆24 Updated 4 months ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆80 Updated 9 months ago
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated 2 months ago
- GC4LM: A Colossal (Biased) language model for German ☆13 Updated 3 years ago
- MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki ☆22 Updated last month
- Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages (ACL 2022) ☆19 Updated 2 years ago
- ☆21 Updated 2 months ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆80 Updated 6 months ago
- An implementation of GrASP (Shnarch et al., 2017) ☆21 Updated 2 years ago
- These are lists for a variety of languages containing words that are distinctive to each language. ☆37 Updated 2 years ago
- Tool for parsing and converting various span encoding schemes. ☆23 Updated last year
- KIND: an Italian Multi-Domain Dataset for Named Entity Recognition ☆15 Updated last year
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆67 Updated 2 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆37 Updated 3 years ago
- Generate BERT vocabularies and pretraining examples from Wikipedias ☆18 Updated 4 years ago
- ☆26 Updated last month
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆85 Updated this week
- Statistics on multilingual datasets ☆17 Updated 2 years ago
- Repo for the LREC 2022 paper The Project Dialogism Novel Corpus: A Dataset for Quotation Attribution in Literary Texts. ☆13 Updated 2 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ☆10 Updated last year
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation ☆14 Updated 7 months ago