BramVanroy / fietje-2
An open, efficient LLM for Dutch
☆35Updated 2 months ago
Related projects
Alternatives and complementary repositories for fietje-2
- Evaluation of language models on mono- or multilingual tasks.☆75Updated last week
- GEITje 7B: a large open Dutch language model☆116Updated 3 weeks ago
- Repository for the EM German Model☆104Updated last year
- A spaCy wrapper for GliNER☆91Updated 4 months ago
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx-like syntax.☆57Updated 6 months ago
- Preconfiguration page of the OpenLLM-France community☆42Updated 9 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆53Updated 3 months ago
- A Scandinavian Benchmark for sentence embeddings☆28Updated last week
- 📚 Process PDFs, Word documents and more with spaCy☆75Updated this week
- ☆21Updated last week
- ☆314Updated this week
- A collection of Italian benchmarks for LLM evaluation☆22Updated 3 weeks ago
- A repository containing the code for translating popular LLM benchmarks to German.☆24Updated last year
- Libraries, Archives and Museums (LAM)☆82Updated 2 years ago
- 🧪 Experimental features for Haystack☆23Updated this week
- A spaCy custom component that extracts and normalizes temporal expressions☆52Updated last year
- A project for training a foundational Danish language model☆68Updated this week
- ☆68Updated 8 months ago
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre…☆82Updated 3 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy☆62Updated 8 months ago
- Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text)☆128Updated this week
- AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.☆28Updated last year
- The official repository for Toxic Commons and Celadon. Toxicity Classification for public domain data.☆9Updated last week
- I.PHI dataset generation☆25Updated 11 months ago
- Software for transferring pre-trained English models to foreign languages☆18Updated last year
- Lightweight self-hosted span annotation tool☆23Updated last week
- Chunk your text more accurately using gpt4o-mini☆42Updated 3 months ago
- Efficiently find the best-suited language model (LM) for your NLP task☆91Updated this week
- This repository contains an easy and intuitive approach to using SetFit in combination with spaCy.☆72Updated last year
- 💫 SpaCy wrapper for ConceptNet 💫☆88Updated last year