BramVanroy / fietje-2
An open, efficient LLM for Dutch
☆44Updated last month
Alternatives and similar repositories for fietje-2:
Users who are interested in fietje-2 are comparing it to the libraries listed below:
- Evaluation of language models on mono- or multilingual tasks.☆81Updated this week
- GEITje 7B: a large open Dutch language model☆124Updated 3 weeks ago
- A project for training a foundational Danish language model☆71Updated last week
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.☆55Updated 6 months ago
- A Scandinavian Benchmark for sentence embeddings☆33Updated last week
- Robust and fast topic models with sentence-transformers.☆43Updated this week
- Norwegian Transformer Model☆115Updated 2 months ago
- A spaCy custom component that extracts and normalizes temporal expressions☆54Updated 2 years ago
- A repository containing the code for translating popular LLM benchmarks to German.☆25Updated last year
- Code to create the dataset from "A New Aligned Simple German Corpus"☆10Updated last year
- Generalist and Lightweight Model for Text Classification☆79Updated this week
- Information extraction from English and German texts based on predicate logic☆135Updated last year
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx-like syntax.☆59Updated 9 months ago
- A unified framework for leveraging LLMs☆67Updated this week
- The 🌟ANITA project🌟 *(Advanced Natural-based interaction for the ITAlian language)* wants to provide Italian NLP researchers with an im…☆17Updated 5 months ago
- A High-level Library for Named Entity Recognition in Python.☆23Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts.☆151Updated 8 months ago
- 💫 SpaCy wrapper for ConceptNet 💫☆89Updated last year
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre…☆83Updated 3 years ago
- Repository collecting resources and best practices to improve experimental rigour in deep learning research.☆27Updated last year
- The central repo for Creole-based NLU and NLG work☆17Updated 8 months ago
- ☆108Updated last year
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️☆163Updated 8 months ago
- An easy way to chunk spaCy docs.☆19Updated 6 months ago
- ☆26Updated 6 months ago
- Repository containing the code for training the CroissantLLM☆21Updated last year
- ☆67Updated 11 months ago
- A Python library aimed at dissecting and augmenting NER training data.☆58Updated last year
- Sentiment Corpus for Swedish 🇸🇪 Norwegian 🇳🇴 Danish 🇩🇰 Finnish 🇫🇮 (and English 🏴)☆15Updated 3 years ago