santhoshtr / sfst
Stuttgart Finite State Transducer system
☆20 Updated 10 months ago
Alternatives and similar repositories for sfst
Users that are interested in sfst are comparing it to the libraries listed below
- This is a new backend implementation of the ANNIS linguistic search and visualization system. ☆18 Updated this week
- Lexical data at Unicode ☆68 Updated 11 months ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆65 Updated last year
- A highly extensible platform for conversion and manipulation of linguistic data between an unbound set of formats. Pepper can be used st… ☆24 Updated 7 months ago
- OCR-D post-correction with encoder-attention-decoder LSTMs ☆13 Updated 3 months ago
- Crop And Splice Segments (of scanned pages) ☆14 Updated 6 years ago
- Named entity annotation tool ☆28 Updated 2 years ago
- extract text from ALTO file ☆9 Updated last year
- Tools for TICCL ☆14 Updated 2 months ago
- Clone of https://gitlab.com/scripta/escriptorium.git ☆26 Updated last week
- OCRopus model for Gothic print (Fraktur) ☆18 Updated 5 years ago
- Double-checked Gold Standard Data for Training and Testing OCR Engines ☆18 Updated 2 years ago
- WordNet-LMF formats ☆22 Updated 3 weeks ago
- Tools for working with book data ☆18 Updated 2 months ago
- A sentence segmentation library with wide language support optimized for speed and utility. ☆66 Updated last month
- Faster, modernized fork of the language identification tool langid.py ☆56 Updated 8 months ago
- Stand-off Text Annotation Model (STAM) is a data model for stand-off-text annotation where any information on a text is represented as an… ☆21 Updated 3 weeks ago
- Manuals, lexica, OCR test data for PoCoTo and the profiler ☆15 Updated 4 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆105 Updated last week
- Tools for scraping, annotating, and parsing morphological information from Wiktionary ☆15 Updated 5 years ago
- QA-tool for scans with corresponding ALTO-files ☆26 Updated 2 years ago
- A suite of batches and tools for OCR tasks. ☆71 Updated 2 years ago
- tesseractXplore a tesseract ease of use gui with full control ☆24 Updated 3 years ago
- A code for transliterating (romanizing) Arabic text using the American Library Association - Library of Congress (ALA-LC) standard ☆48 Updated 3 years ago
- German part-of-speech dictionary ☆45 Updated last year
- 🗣 Multilingual RDF Verbalizer – Google Summer of Code 2019 ☆22 Updated 2 years ago
- Helsinki Finite-State Technology (library and application suite) ☆133 Updated 2 months ago
- Gramadán: a computational grammar of Irish ☆15 Updated 2 years ago
- ☆10 Updated 2 years ago
- A set of utilities for processing MediaWiki XML dump data. ☆57 Updated 5 months ago