somosnlp / corpus-es
List of Spanish NLP corpora ✨ #Somos600M: Help develop inclusive AI that understands the different varieties of our languages ✨ English-speaking contributors welcome!
☆17Updated 8 months ago
Related projects:
- A pre-trained language model for social media text in Spanish☆34Updated last year
- Official source for Spanish language models and resources made @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).☆249Updated last year
- Hands-on course: NLP from zero to a hundred 🤗☆182Updated 5 months ago
- Ready to use Spanish Word2Vec embeddings created from >18B chars and >3B words☆38Updated 5 years ago
- Unannotated Spanish 3 Billion Words Corpora☆91Updated last year
- Somos NLP website 🤗 Publish on our blog!☆22Updated 2 weeks ago
- Exercises for learning to do NLP powered by the Hugging Face libraries.☆23Updated 2 years ago
- Spanish word embeddings computed with different methods and from different corpora☆353Updated 4 years ago
- spanlp: NLP applied to Spanish vulgarity. A fast, robust Python library to check for profanity or offensive language in Spanish strings. …☆35Updated 3 months ago
- ☆44Updated 2 years ago
- ☆23Updated 3 years ago
- Specialization of BERT architecture both for the Spanish language and the Twitter domain☆13Updated 3 years ago
- Spanish data from the AnCora corpus.☆28Updated 3 months ago
- A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks☆542Updated 2 months ago
- WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Wor…☆173Updated 3 months ago
- Official source for Spanish pretrained biomedical and clinical language models and resources made @ BSC-TEMU within the "Plan de las Tecn…☆25Updated last year
- BETO - Spanish version of the BERT model☆488Updated 10 months ago
- Spanish Billion Word Corpus and Embeddings☆45Updated last year
- List of research and engineering of NLP for American Native/Indigenous Languages.☆87Updated 3 years ago
- Spanish rule-based lemmatization for spaCy☆37Updated 2 years ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework☆79Updated 2 years ago
- Slides, notebooks, and material from talks, workshops, and the study group☆12Updated 4 months ago
- ☆61Updated last year
- The Spanish Fake News Corpus contains a collection of 971 news items, divided into 491 real and 480 fake. The corpus covers news from …☆37Updated 3 years ago
- Material for the Starsconf 2018 workshop "Representaciones vectoriales de palabras basadas en redes neuronales" (neural-network-based word vector representations)☆23Updated 5 years ago
- Explainable Zero-Shot Topic Extraction☆62Updated last month
- ☆32Updated last year
- Curated list of Linguistic Resources for doing NLP & CL on Spanish☆329Updated 8 months ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do…☆72Updated 2 months ago
- [LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweeban…☆101Updated 7 months ago