ZurichNLP / swissbert
The multilingual language model for Switzerland
☆28Updated last year
Alternatives and similar repositories for swissbert
Users who are interested in swissbert are comparing it to the repositories listed below.
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio…☆45Updated last year
- Software for transferring pre-trained English models to foreign languages☆19Updated 2 years ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2…☆69Updated 2 years ago
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al.☆98Updated 2 years ago
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases".☆99Updated 2 years ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆62Updated last year
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets.☆13Updated 2 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do…☆80Updated last year
- BLOOM+1: Adapting BLOOM model to support a new unseen language☆74Updated last year
- Automatically detect errors in annotated corpora.☆47Updated 2 years ago
- Load What You Need: Smaller Multilingual Transformers for PyTorch and TensorFlow 2.0.☆104Updated 3 years ago
- An easy-to-use API for analyzing INCEpTION annotation projects.☆17Updated 2 years ago
- A module to compute textual lexical richness (aka lexical diversity).☆110Updated 2 years ago
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy.☆108Updated last year
- TimeLMs: Diachronic Language Models from Twitter☆111Updated last year
- This repository contains a demonstrative implementation for pooling-based models, e.g., DeepPyramidion complementing our paper "Sparsifyi…☆14Updated 3 years ago
- NTREX -- News Test References for MT Evaluation☆86Updated last year
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13☆198Updated 2 months ago
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions☆28Updated 5 years ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish.☆59Updated 2 years ago
- KIND: an Italian Multi-Domain Dataset for Named Entity Recognition☆15Updated 2 years ago
- A survey of corpora for Germanic low-resource languages and dialects☆26Updated 11 months ago
- ☆27Updated 8 months ago
- Lexical Simplification with Pretrained Encoders☆70Updated 4 years ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning☆30Updated 2 years ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆85Updated last year
- Semantically Structured Sentence Embeddings☆69Updated last year
- ☆171Updated last year
- Repository for Vajjala & Lucic (2018)☆66Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.☆96Updated 2 years ago