BramVanroy / fietje-2
An open, efficient LLM for Dutch
☆54Updated 2 months ago
Alternatives and similar repositories for fietje-2
Users who are interested in fietje-2 are comparing it to the libraries listed below.
- The robust European language model benchmark.☆120Updated this week
- GEITje 7B: a large open Dutch language model☆128Updated 7 months ago
- The website for Danish Foundation Models, a project for training foundational Danish language models.☆74Updated last week
- Norwegian Transformer Model☆116Updated 9 months ago
- Lightweight self-hosted span annotation tool☆34Updated 2 weeks ago
- Camoscio: An Italian instruction-tuned language model based on LLaMA☆127Updated last year
- REMERGE - Multi-Word Expression discovery algorithm☆14Updated 2 years ago
- This repository contains a demonstrative implementation for pooling-based models, e.g., DeepPyramidion complementing our paper "Sparsifyi…☆14Updated 3 years ago
- German Language Understanding Evaluation Benchmark @NAACL24☆15Updated last month
- AfroLID, a powerful neural toolkit for African language identification which covers 517 African languages.☆32Updated 5 months ago
- A Scandinavian Benchmark for sentence embeddings☆40Updated 3 months ago
- LTG-Bert☆33Updated last year
- A Python library aimed at dissecting and augmenting NER training data.☆58Updated 2 years ago
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax.☆59Updated last year
- ☆17Updated 2 years ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆83Updated 11 months ago
- German Text Embedding Clustering Benchmark☆18Updated last year
- A repository containing the code for translating popular LLM benchmarks to German.☆28Updated 2 years ago
- A French sequence-to-sequence pretrained model☆62Updated 3 years ago
- ☆110Updated last year
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do…☆82Updated last year
- A spaCy custom component that extracts and normalizes temporal expressions☆55Updated 2 years ago
- ☆49Updated last year
- KIND: an Italian Multi-Domain Dataset for Named Entity Recognition☆15Updated 2 years ago
- BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s …☆139Updated 2 years ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆60Updated last year
- Code for SaGe subword tokenizer (EACL 2023)☆26Updated 9 months ago
- Software for transferring pre-trained English models to foreign languages☆18Updated 2 years ago
- Personal information identification standard☆21Updated last year
- DaCy: The State of the Art Danish NLP pipeline using SpaCy☆97Updated 8 months ago