santhoshtr / sfst
Stuttgart Finite State Transducer system
☆21 Updated 2 months ago
Alternatives and similar repositories for sfst
Users that are interested in sfst are comparing it to the libraries listed below
- Lexical data at Unicode ☆70 Updated last year
- Crop And Splice Segments (of scanned pages) ☆14 Updated 6 years ago
- Python tools for interacting with Wikidata ☆156 Updated 2 years ago
- An efficient data structure for fast string similarity searches ☆22 Updated 4 years ago
- This is a new backend implementation of the ANNIS linguistic search and visualization system. ☆18 Updated 3 weeks ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆66 Updated last year
- Flask Interface to Thompson's Motif Index ☆18 Updated 6 years ago
- an approximate string matching or fuzzy-matching system for spelling correction, normalisation or post-OCR correction (mirror of https://… ☆37 Updated 3 weeks ago
- Targeted language identifier, based on FastText and Hunspell. ☆37 Updated last month
- Tools for scraping, annotating, and parsing morphological information from Wiktionary ☆15 Updated 6 years ago
- Python based Wikidata framework for easy dataframe extraction ☆45 Updated last year
- A python module for word inflections designed for use with spaCy. ☆93 Updated 5 years ago
- search interface for scholarly works ☆84 Updated last year
- Next-generation Punkt sentence boundary detection with zero dependencies ☆20 Updated 2 months ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 Updated 3 years ago
- 🌸 Train floret vectors ☆18 Updated 2 years ago
- Tools for working with book data ☆18 Updated 2 weeks ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆46 Updated 7 years ago
- Named entity recognition for the legal domain ☆42 Updated 4 years ago
- Aksharamukha Python Library ☆54 Updated 8 months ago
- Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your com… ☆133 Updated last week
- WordNet-LMF formats ☆24 Updated 3 months ago
- Automatically exported from code.google.com/p/guess-language ☆53 Updated last week
- Wikidata authority file mapping tool ☆11 Updated 7 years ago
- A highly extensible platform for conversion and manipulation of linguistic data between an unbound set of formats. Pepper can be used st… ☆24 Updated 9 months ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆52 Updated 5 years ago
- A set of utilities for processing MediaWiki XML dump data. ☆57 Updated 8 months ago
- Codemeta paper. ☆10 Updated 8 years ago
- A python package to simulate typographical errors. ☆38 Updated last year
- Advanced desktop search/corpus exploration prototype ☆21 Updated 4 years ago