Compound splitter for the German language ("Komposita-Zerlegung") based on a large dictionary combined with highly efficient multi-pattern string search
☆35 · Jul 7, 2022 · Updated 3 years ago
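The dictionary-based idea in the description can be illustrated with a minimal sketch. This is not the repository's implementation: it swaps the multi-pattern string search (for example an Aho-Corasick automaton) for plain set lookups with memoized recursion, and the names `TOY_DICTIONARY`, `LINKERS`, and `split_compound`, along with the tiny word list, are hypothetical and chosen only for illustration.

```python
from functools import lru_cache

# Hypothetical toy dictionary; a real splitter would load a large word list.
TOY_DICTIONARY = frozenset({"donau", "dampf", "schiff", "fahrt", "arbeit", "zimmer"})

# Simplified assumption: only these linking elements (Fugenelemente) are allowed.
LINKERS = ("", "s", "es")


def split_compound(word, dictionary=TOY_DICTIONARY, min_len=3):
    """Split `word` into dictionary parts; return [word] if no full split exists."""
    text = word.lower()
    n = len(text)

    @lru_cache(maxsize=None)
    def solve(start):
        # Returns a tuple of parts covering text[start:], or None if impossible.
        if start == n:
            return ()
        best = None
        for end in range(start + min_len, n + 1):
            part = text[start:end]
            if part not in dictionary:
                continue
            for linker in LINKERS:
                if not text.startswith(linker, end):
                    continue
                nxt = end + len(linker)
                # A linking element may only sit between two parts.
                if linker and nxt == n:
                    continue
                rest = solve(nxt)
                if rest is None:
                    continue
                candidate = (part,) + rest
                # Prefer the split with the fewest parts.
                if best is None or len(candidate) < len(best):
                    best = candidate
        return best

    result = solve(0)
    return list(result) if result else [text]


print(split_compound("Donaudampfschifffahrt"))  # ['donau', 'dampf', 'schiff', 'fahrt']
print(split_compound("Arbeitszimmer"))          # ['arbeit', 'zimmer']
```

Set lookups test every substring individually, which is acceptable for single words. A multi-pattern matcher finds all dictionary hits in one pass over the text, which is presumably why the project pairs it with a large dictionary.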
Alternatives and similar repositories for german_compound_splitter
Users that are interested in german_compound_splitter are comparing it to the libraries listed below.
- Repo for the simplified text alignment tools. ☆21 · Dec 4, 2020 · Updated 5 years ago
- Adnabod lleferydd Cymraeg i'r Gymraeg gyda HuggingFace // Speech Recognition for Welsh with HuggingFace ☆13 · Nov 29, 2022 · Updated 3 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆152 · Dec 9, 2024 · Updated last year
- Python code to automatically produce a summary of a piece of text. ☆11 · Sep 8, 2016 · Updated 9 years ago
- IPA Phonetic dataset lexicon ☆18 · May 26, 2026 · Updated 2 weeks ago
- German Language Understanding Evaluation Benchmark @NAACL24 ☆23 · Dec 11, 2025 · Updated 6 months ago
- Legal Reference Extraction ☆48 · May 12, 2026 · Updated last month
- A lemmatizer for German language text ☆95 · Feb 7, 2023 · Updated 3 years ago
- A neural network hyphenator for the German language ☆45 · Oct 25, 2023 · Updated 2 years ago
- 🫠 check your data, before you wreck your model ☆16 · Aug 11, 2022 · Updated 3 years ago
- Coqui Inference Engine ☆41 · Aug 3, 2021 · Updated 4 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆29 · Apr 17, 2024 · Updated 2 years ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆526 · Oct 30, 2024 · Updated last year
- How loud is that file? ☆12 · Sep 3, 2019 · Updated 6 years ago
- The source code for the TIRA Shared Task Platform ☆17 · Updated this week
- XWikisCorpus, cross-lingual summarisation, multi-lingual summarisation, pre-trained language models, zero-shot and few-shot summarisation… ☆10 · Nov 4, 2022 · Updated 3 years ago
- Awesome stuff made by the Mycroft community ☆13 · Sep 16, 2021 · Updated 4 years ago
- "Learning Rhyming Constraints using Structured Adversaries. Jhamtani H., Mehta S., Carbonell J., Berg-Kirkpatrick T. EMNLP-IJCNLP (Short … ☆11 · Mar 17, 2020 · Updated 6 years ago
- Wikipedia text corpus for self-supervised NLP model training ☆47 · Jul 17, 2022 · Updated 3 years ago
- Wrapper for the yr.no weather service API. ☆15 · Apr 12, 2018 · Updated 8 years ago
- The NLPStatTest project ☆12 · Mar 12, 2022 · Updated 4 years ago
- Poems retrieval demo built with GNES framework ☆14 · Oct 3, 2019 · Updated 6 years ago
- ☆11 · Dec 8, 2022 · Updated 3 years ago
- A small package for handy conversion of German numerals (also ordinal / signed) written as words to numbers. ☆12 · Jan 22, 2026 · Updated 4 months ago
- TSAR2022 Shared Task on Lexical Simplification - Datasets and Evaluation scripts ☆10 · Oct 27, 2022 · Updated 3 years ago
- SMOR (Stuttgart Morphology) with alternative lemmatization component ☆13 · Aug 10, 2023 · Updated 2 years ago
- Small-vocabulary neural sequence-to-sequence generation with optional feature conditioning ☆36 · Jun 8, 2026 · Updated last week
- Simple word to frequency mappings for the German language based on text corpora and using CISTEM stemmer. ☆14 · Apr 3, 2021 · Updated 5 years ago
- Toolkit to obtain and preprocess German text corpora, train models and evaluate them with generated testsets. Built with Gensim and Tenso… ☆242 · Updated this week
- Interface for using TTS and vocoder models in the form of a text editor ☆20 · Nov 25, 2025 · Updated 6 months ago
- Code to create the dataset from "A New Aligned Simple German Corpus" ☆11 · Jan 8, 2024 · Updated 2 years ago
- Scripts to simplify data prepping for Mozilla DeepSpeech. ☆14 · Aug 6, 2019 · Updated 6 years ago
- X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents (JCDL 2022) ☆14 · Jul 22, 2022 · Updated 3 years ago
- Open German WordNet ☆101 · Jan 7, 2026 · Updated 5 months ago
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages. ☆25 · Oct 27, 2023 · Updated 2 years ago
- Home surveillance system with facial recognition ☆17 · Jun 10, 2020 · Updated 6 years ago
- Lyrics and Vocal Melody Generation conditioned on Accompaniment ☆28 · Aug 27, 2022 · Updated 3 years ago
- Poetry Corpora Annotated on Aesthetic Emotions ☆13 · Aug 2, 2022 · Updated 3 years ago
- Code supporting the paper Graph-Embedding Empowered Entity Retrieval ☆24 · Apr 11, 2025 · Updated last year