santhoshtr / sfst
Stuttgart Finite State Transducer system
☆19 · Updated 5 months ago
Alternatives and similar repositories for sfst:
Users who are interested in sfst are comparing it to the repositories listed below.
- Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s) ☆17 · Updated last week
- Lexical data at Unicode ☆68 · Updated 7 months ago
- This is a new backend implementation of the ANNIS linguistic search and visualization system. ☆17 · Updated 3 weeks ago
- Tools for scraping, annotating, and parsing morphological information from Wiktionary ☆13 · Updated 5 years ago
- Stand-off Text Annotation Model (STAM) is a data model for stand-off-text annotation where any information on a text is represented as an… ☆18 · Updated 4 months ago
- Manuals, lexica, OCR test data for PoCoTo and the profiler ☆15 · Updated 3 years ago
- OCR-D post-correction with encoder-attention-decoder LSTMs ☆13 · Updated 6 months ago
- Loadable spellfix1 extension for sqlite as python package ☆26 · Updated 11 months ago
- TEI Reader Python Library ☆17 · Updated last year
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆63 · Updated 10 months ago
- Tools for TICCL ☆14 · Updated 3 months ago
- An efficient data structure for fast string similarity searches ☆22 · Updated 4 years ago
- Ontologies of Linguistic Annotation. Machine-readable tagsets and annotation schemata for more than 100 languages. ☆20 · Updated 4 months ago
- Crop And Splice Segments (of scanned pages) ☆14 · Updated 6 years ago
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni… ☆12 · Updated last year
- An OCR evaluation tool ☆65 · Updated last month
- A powerful, tagset-independent and theory-neutral meta model and API for storing, manipulating, and representing nearly all types of ling… ☆15 · Updated 2 years ago
- Converts MS Rich Text Format to XML ☆10 · Updated 7 years ago
- Named entity annotation tool ☆27 · Updated last year
- search interface for scholarly works ☆85 · Updated 8 months ago
- PhiloLogic4 ☆38 · Updated 4 months ago
- Keyword extraction with spaCy ☆31 · Updated 3 years ago
- OCRopus model for Gothic print (Fraktur) ☆18 · Updated 5 years ago
- LaTeX e-book editable template that also typesets to the book about writing the book ☆15 · Updated 3 years ago
- Faster, modernized fork of the language identification tool langid.py ☆55 · Updated 4 months ago
- WordNet-LMF formats ☆21 · Updated last month
- Text readability metrics in Python. ☆11 · Updated 11 years ago
- A selection of test lines of several early printed books as well as the corresponding individual OCRopus models and mixed models. ☆10 · Updated 7 years ago
- A highly extensible platform for conversion and manipulation of linguistic data between an unbound set of formats. Pepper can be used st… ☆24 · Updated 3 months ago
- Transform unstructured document collections to structured Linked Data ☆27 · Updated 2 weeks ago