BramVanroy / fietje-2
An open, efficient LLM for Dutch
☆48Updated 3 months ago
Alternatives and similar repositories for fietje-2:
Users who are interested in fietje-2 are comparing it to the libraries listed below:
- GEITje 7B: a large open Dutch language model☆126Updated 3 months ago
- A project for training a foundational Danish language model☆73Updated last month
- The robust European language model benchmark.☆101Updated this week
- Repository for the EM German Model☆109Updated last year
- T-scan: an analysis tool for Dutch texts to assess the complexity of the text, based on original work by Rogier Kraf☆18Updated last week
- The central repo for Creole-based NLU and NLG work☆18Updated 11 months ago
- A High-level Library for Named Entity Recognition in Python.☆23Updated last year
- Plan and train German transformer models.☆23Updated 4 years ago
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre…☆83Updated 3 years ago
- AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.☆31Updated last month
- The multilingual language model for Switzerland☆26Updated last year
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.☆58Updated 9 months ago
- Materials for "IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation" 🇮🇹☆30Updated 10 months ago
- An Easy Annotation Tool for Natural Language Processing☆10Updated 11 months ago
- Repository collecting resources and best practices to improve experimental rigour in deep learning research.☆27Updated 2 years ago
- TextComplexityDE dataset consists of 1000 sentences in the German language with subjective complexity rating, collected from German learn…☆13Updated 3 years ago
- Compiled tools, datasets, and other resources for historical text normalization.☆18Updated 5 years ago
- Code to create the dataset from "A New Aligned Simple German Corpus"☆10Updated last year
- ☆47Updated 9 months ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do…☆80Updated 9 months ago
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy.☆106Updated last year
- This repo provides a Python module to work with Open Dutch WordNet. It was created using Python 3.4.☆65Updated 3 years ago
- A Scandinavian Benchmark for sentence embeddings☆36Updated 2 months ago
- Norwegian Named Entities annotations on top of NDT (Norwegian Dependency Treebank)☆69Updated 7 months ago
- Sentiment Corpus for Swedish 🇸🇪 Norwegian 🇳🇴 Danish 🇩🇰 Finnish 🇫🇮 (and English 🏴)☆15Updated 3 years ago
- Dutch coreference resolution & dialogue analysis using deterministic rules☆21Updated last year
- DaCy: The State of the Art Danish NLP pipeline using spaCy☆95Updated 4 months ago
- Named Entity Recognition (LSTM + CRF + FastText) with models for [historic] German☆26Updated 3 years ago
- COMBO is a jointly trained tagger, lemmatizer and dependency parser.☆35Updated 2 years ago
- Annotated corpus + evaluation metrics for text anonymisation☆55Updated last year