repodiac/german_compound_splitter

Compound splitter for the German language ("Komposita-Zerlegung"), based on a large dictionary combined with a highly efficient multi-pattern string search

☆36

Alternatives and similar repositories for german_compound_splitter

Users that are interested in german_compound_splitter are comparing it to the libraries listed below.

dtuggener / CharSplit
Compound splitter for German
☆114 · Apr 5, 2020 · Updated 6 years ago
techiaith / docker-huggingface-stt-cy
Adnabod lleferydd Cymraeg i'r Gymraeg gyda HuggingFace // Speech Recognition for Welsh with HuggingFace
☆13 · Nov 29, 2022 · Updated 3 years ago
domcross / german-stt-evaluation
Evaluation of STT models for the German language
☆16 · Jan 22, 2022 · Updated 4 years ago
neosyon / SimpTextAlign
Repo for the simplified text alignment tools.
☆21 · Dec 4, 2020 · Updated 5 years ago
ghpaetzold / massalign
Alignment and annotation for comparable documents.
☆22 · Oct 16, 2018 · Updated 7 years ago
Germanet-sfs / germanetpy
☆12 · Jun 10, 2026 · Updated last month
repodiac / german_transliterate
Python module to clean and transliterate (i.e. normalize) German text including abbreviations, numbers, timestamps etc. It can be used to…
☆39 · Jan 16, 2021 · Updated 5 years ago
EdCo95 / text-summarization
Python code to automatically produce a summary of a piece of text.
☆11 · Sep 8, 2016 · Updated 9 years ago
rhasspy / espeak-phonemizer
Uses ctypes and libespeak-ng to transform text into IPA phonemes
☆26 · Sep 20, 2023 · Updated 2 years ago
G-Research / fast-string-search
☆13 · Apr 13, 2021 · Updated 5 years ago
stefan-it / ukrainian-electra
Ukrainian ELECTRA model
☆12 · Mar 11, 2023 · Updated 3 years ago
lauhaide / clads
XWikisCorpus, cross-lingual summarisation, multi-lingual summarisation, pre-trained language models, zero-shot and few-shot summarisation…
☆10 · Nov 4, 2022 · Updated 3 years ago
coqui-ai / data-checker
🫠 check your data, before you wreck your model
☆16 · Aug 11, 2022 · Updated 3 years ago
msiemens / HypheNN-de
A neural network hyphenator for the German language
☆45 · Oct 25, 2023 · Updated 2 years ago
adbar / German-NLP
Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German
☆527 · Oct 30, 2024 · Updated last year
rhasspy / glow-speak
Neural text to speech system that uses eSpeak as a text/phoneme front-end
☆16 · Oct 20, 2021 · Updated 4 years ago
LaSTUS-TALN-UPF / TSAR-2022-Shared-Task
TSAR2022 Shared Task on Lexical Simplification - Datasets and Evaluation scripts
☆10 · Oct 27, 2022 · Updated 3 years ago
GermanT5 / wikipedia2corpus
Wikipedia text corpus for self-supervised NLP model training
☆47 · Jul 17, 2022 · Updated 4 years ago
ChanceNCounter / awesome-mycroft-community
Awesome stuff made by the Mycroft community
☆12 · Sep 16, 2021 · Updated 4 years ago
masakhane-io / masakhanePreprocessor
Building an effective preprocessing tool for African languages
☆13 · Jan 24, 2024 · Updated 2 years ago
nlp-stat-test / nlp-stat-test
The NLPStatTest project
☆12 · Mar 12, 2022 · Updated 4 years ago
julmaxi / Abstractive-Timeline-Summarization
☆11 · Dec 8, 2022 · Updated 3 years ago
harsh19 / Structured-Adversary
"Learning Rhyming Constraints using Structured Adversaries. Jhamtani H., Mehta S., Carbonell J., Berg-Kirkpatrick T. EMNLP-IJCNLP (Short …
☆11 · Mar 17, 2020 · Updated 6 years ago
sloev / spacy-syllables
Multilingual syllable annotation pipeline component for spacy
☆39 · Mar 8, 2023 · Updated 3 years ago
helboukkouri / embedding-visualization
This is a project for visualizing word embeddings based on the work of Andrei Kashcha (@anvaka).
☆24 · Mar 29, 2019 · Updated 7 years ago
buschmo / Simple-German-Corpus
Code to create the dataset from "A New Aligned Simple German Corpus"
☆11 · Jan 8, 2024 · Updated 2 years ago
evanshortiss / yr.no-interface
Wrapper for the yr.no weather service API.
☆15 · Apr 12, 2018 · Updated 8 years ago
Wortmeister-HQ / zahlwort2num
A small package for conveniently converting German numerals written as words (including ordinal and signed numerals) to numbers.
☆12 · Jan 22, 2026 · Updated 6 months ago
NC0DER / GraphOfDocs
GraphOfDocs: Representing multiple documents as a single graph
☆21 · Jun 22, 2022 · Updated 4 years ago
olastor / german-word-frequencies
Simple word-to-frequency mappings for the German language, based on text corpora and using the CISTEM stemmer.
☆14 · Apr 3, 2021 · Updated 5 years ago
openlegaldata / legal-reference-extraction
Legal Reference Extraction
☆49 · Jun 15, 2026 · Updated last month
emanjavacas / pie
A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages.
☆25 · Oct 27, 2023 · Updated 2 years ago
dapascual / K2T
☆70 · Oct 29, 2021 · Updated 4 years ago
smaybius / Coqui-TTS-GUI-solution
Interface for using TTS and vocoder models in the form of a text editor
☆20 · Nov 25, 2025 · Updated 8 months ago
sobamchan / xscitldr
X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents (JCDL 2022)
☆14 · Jul 22, 2022 · Updated 4 years ago
mrquincle / ancient-c-compilers
Very old C compilers
☆29 · Aug 12, 2014 · Updated 11 years ago
informagi / GEEER
Code supporting the paper Graph-Embedding Empowered Entity Retrieval
☆24 · Apr 11, 2025 · Updated last year
valentinhofmann / flota
☆18 · Feb 1, 2023 · Updated 3 years ago
tsproisl / SoMeWeTa
A part-of-speech tagger with support for domain adaptation and external resources.
☆24 · Oct 26, 2022 · Updated 3 years ago