gambolputty / wiktionary-de-parser
Extract data from German Wiktionary XML files.
☆26 · Updated 5 months ago
Alternatives and similar repositories for wiktionary-de-parser
Users who are interested in wiktionary-de-parser are comparing it to the libraries listed below.
- A list of ~100,000 German nouns and their grammatical properties compiled from WiktionaryDE as CSV file. Plus a module to look up the dat… ☆153 · Updated 5 months ago
- Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project ☆49 · Updated 7 months ago
- Tools for scraping, annotating, and parsing morphological information from Wiktionary ☆14 · Updated 5 years ago
- Browser extension adding shortcuts to DWDS queries ☆8 · Updated 5 months ago
- German part-of-speech dictionary ☆45 · Updated last year
- A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning tech… ☆72 · Updated 6 months ago
- Deutsches Lyrik Korpus (DLK) / German Poetry Corpus ☆18 · Updated last year
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆160 · Updated this week
- Morphological Dictionaries for German Language ☆29 · Updated 7 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆101 · Updated 2 weeks ago
- Anki add-on to look up vocabulary using Wiktionary ☆18 · Updated 3 months ago
- Tools for creating DSL-format dictionaries ☆15 · Updated 3 years ago
- Wiktionary parser tool for many language editions. ☆54 · Updated 2 years ago
- IWNLP: A parser for the German edition of Wiktionary ☆13 · Updated last year
- Open morphology for Finnish ☆90 · Updated last month
- Helsinki Finite-State Technology (library and application suite) ☆130 · Updated last week
- DTA Base Format (DTABf) ☆18 · Updated 2 months ago
- Editor for aligned parallel texts (personal desktop application). ☆19 · Updated 4 years ago
- A Corpus Data Retrieval Index using Lucene for Look-Ups ☆17 · Updated last week
- Conversions between various OCR formats ☆78 · Updated 2 years ago
- A list of vocabulary lists ☆21 · Updated 4 years ago
- NLP-helper for OCR-ed pages in PAGE XML format ☆10 · Updated 6 months ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 · Updated 3 years ago
- A NoSketch Engine Docker image which is easy to use ☆19 · Updated 7 months ago
- A Python Wiktionary Parser ☆360 · Updated 3 months ago
- OCR-D python tools ☆33 · Updated 9 months ago
- Creates interlinearized versions of books (EPUB, MOBI, etc), adding "subtitles" with translations under each word in the text. ☆24 · Updated 4 years ago
- Compound splitter for German ☆105 · Updated 5 years ago
- Open German WordNet ☆95 · Updated last year
- TMX Editor written in Java and TypeScript ☆44 · Updated 2 months ago