jogonba2 / twilbert
Specialization of the BERT architecture for both the Spanish language and the Twitter domain
☆13Updated 4 years ago
Alternatives and similar repositories for twilbert
Users who are interested in twilbert are comparing it to the repositories listed below.
- Ready to use Spanish Word2Vec embeddings created from >18B chars and >3B words☆42Updated 6 years ago
- Official source for Spanish language models and resources made at BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).☆259Updated last year
- Spanish Billion Word Corpus and Embeddings☆48Updated 2 years ago
- BERT and ELECTRA models trained on Europeana Newspapers☆38Updated 3 years ago
- List of Spanish NLP corpora ✨ #Somos600M: Help develop inclusive AI that understands the different varieties of our langu…☆22Updated last year
- A data set and model for German sentiment classification.☆67Updated 3 weeks ago
- A pre-trained language model for social media text in Spanish☆35Updated 2 years ago
- Ten Thousand German News Articles Dataset for Topic Classification☆84Updated 2 years ago
- A Python package to enrich Twitter data☆75Updated 2 years ago
- WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Wor…☆177Updated last year
- A French sequence-to-sequence pretrained model☆61Updated 2 years ago
- Jojajovai Guarani-Spanish Parallel Corpus☆15Updated 2 years ago
- Unannotated Spanish 3 Billion Words Corpora☆101Updated 2 years ago
- Spanish word embeddings computed with different methods and from different corpora☆359Updated 5 years ago
- Multilingual toolkit for NLP: dependency parser, PoS tagger, NERC, multiword extractor, sentiment analysis, etc.☆66Updated last year
- Official repository of the Hate Speech Detection Tasks at Evalita☆12Updated 4 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do…☆80Updated 11 months ago
- BETO - Spanish version of the BERT model☆497Updated last year
- Anonymization of legal cases (Fr) based on Flair embeddings☆88Updated 4 years ago
- ☆63Updated 2 years ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2…☆67Updated 2 years ago
- ☆22Updated 5 years ago
- Simple customizable pipeline tool for anonymizing Danish text.☆10Updated 9 months ago
- Multilingual Emotion classification using BERT (fine-tuning). Published at the WASSA workshop (ACL2022).☆8Updated 2 years ago
- ☆64Updated 2 years ago
- ALBETO and DistilBETO are versions of ALBERT and DistilBERT pre-trained exclusively on Spanish corpora.☆37Updated 2 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts.☆156Updated last year
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13☆180Updated 2 weeks ago
- Annotated corpus + evaluation metrics for text anonymisation☆57Updated last year
- A German ELMo deep contextualized word representation, trained on a special German Wikipedia text corpus.☆28Updated 5 years ago