uhermjakob / wildebeest
Scripts to investigate, repair and normalize a wide range of text file problems at the character level.
☆18 Updated 2 years ago
Alternatives and similar repositories for wildebeest:
Users who are interested in wildebeest are comparing it to the libraries listed below.
- Bilingual sentence similarity classifier using Tensorflow ☆20 Updated 5 years ago
- ☆25 Updated last year
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst… ☆22 Updated 3 years ago
- Library and command line utility to do approximate string matching of a source against a bitext index and get matched source and target. ☆47 Updated last month
- universal tokenizer ☆15 Updated 3 years ago
- OpusFilter - Parallel corpus processing toolkit ☆104 Updated this week
- Efficient Low-Memory Aligner ☆140 Updated 2 weeks ago
- A tiny BERT for low-resource monolingual models ☆31 Updated 4 months ago
- Automatic extraction of edited sentences from text edition histories. ☆82 Updated 2 years ago
- MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki ☆22 Updated last month
- Alignment and annotation for comparable documents. ☆22 Updated 6 years ago
- A guide to building language technology in new languages. ☆58 Updated 3 years ago
- Multilingual sentence alignment using sentence embeddings ☆106 Updated 2 months ago
- SeqScore: Scoring for named entity recognition and other sequence labeling tasks ☆22 Updated 3 weeks ago
- Curriculum training ☆16 Updated 2 weeks ago
- An accurate multilingual word aligner based on LaBSE ☆20 Updated last year
- Linguistic and stylistic complexity measures for (literary) texts ☆79 Updated last year
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ☆37 Updated 2 years ago
- List of corpora annotated for coreference for different languages ☆17 Updated 5 months ago
- An NLP pipeline for Hebrew ☆36 Updated 9 months ago
- SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP ☆13 Updated 3 years ago
- Repository for rstWeb, a browser based annotation interface for Rhetorical Structure Theory ☆42 Updated 3 months ago
- NTREX -- News Test References for MT Evaluation