christos-c / bible-corpus
A multilingual parallel corpus created from translations of the Bible.
☆172 · Updated 3 months ago
Related projects:
- Sentence aligner ☆106 · Updated 3 years ago
- Efficient Low-Memory Aligner ☆135 · Updated 2 weeks ago
- A character-wise tokenizer for morphologically rich languages ☆27 · Updated 3 months ago
- A collection of tools for reading/processing the multilingual Bible corpus ☆14 · Updated last year
- Various utilities for processing the data. ☆203 · Updated this week
- Curated corpus of parallel data derived from versions of the Bible provided by eBible.org. ☆51 · Updated last month
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ☆180 · Updated 3 years ago
- OpusFilter - Parallel corpus processing toolkit ☆101 · Updated last month
- List of research and engineering of NLP for American Native/Indigenous Languages. ☆87 · Updated 3 years ago
- Bitextor generates translation memories from multilingual websites ☆287 · Updated 3 months ago
- Improved Sentence Alignment in Linear Time and Space ☆157 · Updated last year
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆148 · Updated 3 months ago
- Python framework for processing Universal Dependencies data ☆55 · Updated last week
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 · Updated last year
- ConllEditor is a tool to edit dependency syntax trees in CoNLL-U format. ☆53 · Updated this week
- The Open Multilingual Wordnet ☆58 · Updated 4 months ago
- A minimal, pure Python library to interface with CoNLL-U format files. ☆149 · Updated last year
- Efficient Markov Chain word alignment ☆54 · Updated 3 years ago
- Automatic extraction of edited sentences from text edition histories. ☆80 · Updated 2 years ago
- CoNLL-U to Pandas DataFrame ☆30 · Updated 6 years ago
- Repository for the Georgetown University Multilayer Corpus (GUM) ☆87 · Updated last month
- Universal Dependencies online documentation ☆269 · Updated this week
- A tool that locates, downloads, and extracts machine translation corpora ☆145 · Updated 3 months ago
- Translation Memory Open-source Purifier ☆32 · Updated last year
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology… ☆220 · Updated last year
- A tool for automatic spelling normalization ☆20 · Updated 3 years ago
- Machine-Translation-based sentence alignment tool for parallel text ☆295 · Updated 3 years ago
- STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses) ☆63 · Updated last year