santhoshtr / sfst
Stuttgart Finite State Transducer system
☆18Updated 3 months ago
Alternatives and similar repositories for sfst:
Users who are interested in sfst are comparing it to the libraries listed below.
- This is a new backend implementation of the ANNIS linguistic search and visualization system.☆17Updated this week
- tesseractXplore: an ease-of-use GUI for Tesseract with full control☆21Updated 3 years ago
- Lexical data at Unicode☆67Updated 4 months ago
- Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s)☆17Updated 5 months ago
- Ontologies of Linguistic Annotation. Machine-readable tagsets and annotation schemata for more than 100 languages.☆20Updated 2 months ago
- Measure the similarity of text corpora for 74 languages☆13Updated last year
- Natural Language Inflection in English☆11Updated 3 years ago
- A powerful, tagset-independent and theory-neutral meta model and API for storing, manipulating, and representing nearly all types of ling…☆15Updated last year
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu…☆61Updated 8 months ago
- Manuals, lexica, OCR test data for PoCoTo and the profiler☆15Updated 3 years ago
- Crop And Splice Segments (of scanned pages)☆14Updated 5 years ago
- Faster, modernized fork of the language identification tool langid.py☆50Updated 2 months ago
- OCRopus model for Gothic print (Fraktur)☆18Updated 4 years ago
- User contributed (non Google) OCR models for Tesseract☆24Updated 3 months ago
- Efficient teacher-student models and scripts to make them☆49Updated last year
- Tools for TICCL☆14Updated last month
- A highly extensible platform for conversion and manipulation of linguistic data between an unbound set of formats. Pepper can be used st…☆24Updated 3 weeks ago
- OCR-D post-correction with encoder-attention-decoder LSTMs☆13Updated 3 months ago
- Building and Using A Seed Corpus for the Human Language Project☆11Updated 6 years ago
- QA-tool for scans with corresponding ALTO-files☆22Updated 2 years ago
- Wrapper around pixel classifier☆9Updated 2 years ago
- Morphologically informed POS tagging for German☆26Updated 3 years ago
- ↕️ Intuitive axiomatic retrieval experimentation.☆24Updated last month
- A python package to simulate typographical errors.☆31Updated last year
- The core repository for the Literary Theme Ontology Project.☆20Updated this week
- Parser for KAF and NAF files, written in Python☆16Updated 3 years ago
- Stand-off Text Annotation Model (STAM) is a data model for stand-off-text annotation where any information on a text is represented as an…☆17Updated 2 months ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions☆19Updated last year
- GOPHI: an AMR-to-English Verbalizer☆11Updated 4 years ago
- Visegrad+ Parliament API. Access to parliament data of Visegrad+ countries in a common data standard.☆11Updated 8 years ago