ZurichNLP / swissbert
The multilingual language model for Switzerland
☆25 Updated 10 months ago
Related projects
Alternatives and complementary repositories for swissbert
- A survey of corpora for Germanic low-resource languages and dialects ☆24 Updated 3 months ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆52 Updated last year
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆53 Updated 3 months ago
- GEMBA — GPT Estimation Metric Based Assessment ☆102 Updated 3 months ago
- A High-level Library for Named Entity Recognition in Python. ☆22 Updated 11 months ago
- A software for transferring pre-trained English models to foreign languages ☆18 Updated last year
- CD20200004 from 01/01/2021 to 31/12/2023 - LIG UGA - Python Notebook and Models for the MT Lab @ ALPS 2022 ☆14 Updated 7 months ago
- Semantically Structured Sentence Embeddings ☆67 Updated last month
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT ☆21 Updated last year
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆96 Updated 7 months ago
- German Text Embedding Clustering Benchmark ☆15 Updated 8 months ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆31 Updated 2 years ago
- Neural models for detecting and masking personal information from texts ☆14 Updated last year
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆70 Updated 8 months ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆82 Updated last month
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆30 Updated last year
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy. ☆103 Updated 7 months ago
- Evaluate language models using multiple choice items ☆12 Updated last week
- Automatically detect errors in annotated corpora. ☆47 Updated last year
- Examples for aligning, padding and batching sequence labeling data (NER) for use with pre-trained transformer models ☆65 Updated last year
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆29 Updated last year
- Parallel corpora for the biomedical domain ☆48 Updated 4 months ago
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ☆96 Updated last year
- Corresponding code repo for the paper at COLING 2020 - ARGMIN 2020: "DebateSum: A large-scale argument mining and summarization dataset" ☆53 Updated 2 years ago