loomchild / maligna
Bilingual sentence aligner
☆27Updated last year
Alternatives and similar repositories for maligna:
Users who are interested in maligna are comparing it to the libraries listed below:
- Data collection, alignment and TAUS repository☆23Updated 7 years ago
- Sentence aligner☆110Updated 3 years ago
- Alignment and annotation for comparable documents.☆22Updated 6 years ago
- Tools for extracting parallel corpora from article titles across languages in Wikipedia☆72Updated 10 years ago
- ParCourE - Parallel Corpus Explorer☆12Updated 3 years ago
- GC4LM: A Colossal (Biased) language model for German☆13Updated 3 years ago
- Efficient Low-Memory Aligner☆142Updated last month
- Program used to split text into segments☆25Updated 4 months ago
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst…☆22Updated 3 years ago
- Transform TMX to text☆28Updated 2 years ago
- CoNLL 2018 Shared Task Team UDPipe-Future☆39Updated 4 years ago
- List of corpora annotated for coreference for different languages☆17Updated 6 months ago
- Neural machine translation soft alignment visualisations for web and command line☆72Updated 3 years ago
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation☆14Updated 6 months ago
- Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at…☆22Updated 7 months ago
- Converter from UD-trees to BART representation☆36Updated 11 months ago
- Efficient Markov Chain word alignment☆53Updated 3 years ago
- These are lists for a variety of languages containing words that are distinctive to each language.☆35Updated 2 years ago
- Tool to fix bitexts and tag near-duplicates for removal☆29Updated 3 weeks ago
- Tool for parsing and converting various span encoding schemes.☆22Updated last year
- Multi Tier Annotation Search☆26Updated 3 years ago
- BERT models for many languages created from Wikipedia texts☆33Updated 4 years ago
- Compiled tools, datasets, and other resources for historical text normalization.☆18Updated 5 years ago
- NanigoNet — Language detector for code-mixed input supporting 150+19 human+programming languages using deep neural networks☆72Updated last year
- Appraise evaluation system for manual evaluation of machine translation output☆74Updated 3 years ago
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them☆37Updated 2 years ago