santhoshtr / sfst
Stuttgart Finite State Transducer system
☆21Updated 2 months ago
Alternatives and similar repositories for sfst
Users that are interested in sfst are comparing it to the libraries listed below
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu…☆65Updated last year
- Tools for scraping, annotating, and parsing morphological information from Wiktionary☆15Updated 5 years ago
- This is a new backend implementation of the ANNIS linguistic search and visualization system.☆18Updated this week
- Lexical data at Unicode☆70Updated last year
- An index data structure for approximate string search.☆23Updated 6 years ago
- Crop And Splice Segments (of scanned pages)☆14Updated 6 years ago
- In-browser OCR of Ancient Greek and Latin☆26Updated 3 weeks ago
- A generic, machine learning-based revision scoring system for MediaWiki☆91Updated last year
- OCR-D post-correction with encoder-attention-decoder LSTMs☆13Updated 5 months ago
- An efficient data structure for fast string similarity searches☆22Updated 4 years ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions☆19Updated 2 years ago
- Link Wikidata items to large catalogs☆96Updated 7 months ago
- A set of utilities for processing MediaWiki SQL dump data☆20Updated last year
- Text language detection based on trigrams.☆16Updated 2 years ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF☆39Updated 3 years ago
- OCRopus model for Gothic print (Fraktur)☆18Updated 5 years ago
- Github mirror - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing)☆37Updated last year
- An online citation generator for Wikipedia☆31Updated last month
- Library of Congress coding standards☆30Updated last year
- A python package to simulate typographical errors.☆37Updated last year
- LaMachine - A software distribution of our in-house as well as some 3rd party NLP software - Virtual Machine, Docker, or local compilatio…☆68Updated 2 years ago
- 🔍 Mirror of https://gerrit.wikimedia.org/g/mediawiki/extensions/CirrusSearch. See https://www.mediawiki.org/wiki/Developer_access for co…☆43Updated this week
- Clone of https://gitlab.com/scripta/escriptorium.git☆28Updated 2 months ago
- Automatically exported from code.google.com/p/guess-language☆53Updated last year
- Citation Classification using hybrid neural network model for Wikipedia References☆30Updated 2 years ago
- Python tools for interacting with Wikidata☆154Updated last year
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.☆53Updated 4 years ago
- A python module for word inflections designed for use with spaCy.☆93Updated 5 years ago
- code to remove "noise" from hOCR output of Tesseract OCR.☆14Updated 8 years ago