somosnlp / corpus-es
List of Spanish NLP corpora ✨ #Somos600M: Help develop inclusive AI that understands the different varieties of our languages ✨ English-speaking contributors welcome!
☆22Updated last year
Alternatives and similar repositories for corpus-es
Users who are interested in corpus-es are comparing it to the libraries listed below.
- Official source for Spanish Language Models and resources made @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).☆260Updated 2 years ago
- Unannotated Spanish 3 Billion Words Corpora☆104Updated 2 years ago
- WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Wor…☆179Updated last month
- Spanish word embeddings computed with different methods and from different corpora☆359Updated 5 years ago
- Practical course: NLP from zero to one hundred 🤗☆189Updated last year
- Topic modeling helpers using managed language models from Cohere. Name text clusters using large GPT models.☆223Updated 2 years ago
- BETO - Spanish version of the BERT model☆499Updated last year
- Clustering sentence embeddings to extract message intent☆175Updated 3 years ago
- A pre-trained language model for social media text in Spanish☆35Updated 2 years ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework☆80Updated 3 years ago
- A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks☆606Updated last year
- SpanMarker for Named Entity Recognition☆451Updated 7 months ago
- A Python library aimed at dissecting and augmenting NER training data.☆58Updated 2 years ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti…☆245Updated 2 years ago
- Quote extraction for modular journalism (JournalismAI collab 2021)☆230Updated 3 years ago
- Spanish Billion Word Corpus and Embeddings☆48Updated 2 years ago
- A Python package for benchmarking interpretability techniques on Transformers.☆214Updated 11 months ago
- ✨ Bootstrap annotation with zero- & few-shot learning via OpenAI GPT-3☆323Updated 2 years ago
- Creating class-based TF-IDF matrices☆90Updated 2 years ago
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13☆186Updated this week
- A spaCy custom component that extracts and normalizes temporal expressions☆55Updated 2 years ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy.☆80Updated 2 years ago
- Ready to use Spanish Word2Vec embeddings created from >18B chars and >3B words☆44Updated 6 years ago
- Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.☆17Updated 10 months ago
- List of research and engineering of NLP for American Native/Indigenous Languages.☆92Updated 4 years ago