dccuchile / GLUES
Resources for GLUE benchmark in Spanish
☆15 Updated 4 years ago
Alternatives and similar repositories for GLUES:
Users who are interested in GLUES compare it to the repositories listed below:
- A Benchmark Dataset for Understanding Disfluencies in Question Answering ☆62 Updated 3 years ago
- Morfessor EM+Prune ☆10 Updated 4 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆86 Updated 4 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆87 Updated 2 weeks ago
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases". ☆99 Updated 2 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 3 years ago
- German small and large versions of GPT2. ☆20 Updated 2 years ago
- A simple neural truecaser written in pytorch and allennlp. ☆33 Updated 10 months ago
- Semantically Structured Sentence Embeddings ☆66 Updated 6 months ago
- ☆47 Updated 9 months ago
- Unannotated Spanish 3 Billion Words Corpora ☆101 Updated 2 years ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ☆31 Updated 4 years ago
- Most basic AI Assistant demo derived from the DeepPavlov Dream AI Assistant. ☆13 Updated last year
- ☆64 Updated 2 years ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ☆51 Updated 3 months ago
- NTREX -- News Test References for MT Evaluation ☆83 Updated 10 months ago
- Using short models to classify long texts ☆21 Updated 2 years ago
- A tiny BERT for low-resource monolingual models ☆31 Updated 6 months ago
- spaCy match and replace, maintaining conjugation ☆35 Updated 2 years ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spacy provide state of … ☆61 Updated 4 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆71 Updated last year
- Consists of the largest (10K) human annotated code-switched semantic parsing dataset & 170K generated utterance using the CST5 augmentati… ☆39 Updated 2 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year
- Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0. ☆102 Updated 2 years ago
- BERT models for many languages created from Wikipedia texts ☆33 Updated 4 years ago
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale ☆154 Updated last year
- Bicleaner fork that uses neural networks ☆40 Updated 9 months ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago
- A guide to building language technology in new languages. ☆58 Updated 3 years ago
- This repo contains a set of neural transducer, e.g. sequence-to-sequence model, focusing on character-level tasks. ☆74 Updated last year