alvations / gachalign
Gale-Church sentence aligner with options for variable parameters
☆17 · Updated 5 years ago
Alternatives and similar repositories for gachalign:
Users who are interested in gachalign compare it to the libraries listed below.
- Efficient Markov Chain word alignment ☆53 · Updated 3 years ago
- Efficient Low-Memory Aligner ☆141 · Updated last month
- Alignment and annotation for comparable documents. ☆22 · Updated 6 years ago
- Automatic extraction of edited sentences from text edition histories. ☆82 · Updated 3 years ago
- OpusFilter - Parallel corpus processing toolkit ☆104 · Updated 3 weeks ago
- Tool for comparison and evaluation of machine translation. ☆56 · Updated 2 years ago
- Appraise evaluation system for manual evaluation of machine translation output ☆74 · Updated 3 years ago
- Sentence aligner ☆109 · Updated 3 years ago
- Translation Error Rate (TER) ☆43 · Updated 6 years ago
- Scripts to preprocess training and test data and to run fast_align and giza ☆108 · Updated 3 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆154 · Updated 8 months ago
- Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks. ☆41 · Updated last year
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions ☆26 · Updated 4 years ago
- Improved Sentence Alignment in Linear Time and Space ☆165 · Updated last year
- English HPSG parser ☆51 · Updated 6 years ago
- Tools for extracting parallel corpora from article titles across languages in Wikipedia ☆72 · Updated 9 years ago
- A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT ☆26 · Updated 4 years ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ☆70 · Updated 9 months ago
- A High-Quality Multilingual Dataset for Structured Documentation Translation ☆35 · Updated 7 months ago
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆40 · Updated last year
- An initiative to collect and distribute resources for co-reference resolution in a unified standard. ☆24 · Updated 9 months ago
- Pipelined quality estimation. ☆51 · Updated 5 years ago
- Corpus preprocessing ☆95 · Updated 11 months ago
- Easier Automatic Sentence Simplification Evaluation ☆160 · Updated last year
- MT Evaluation in Many Languages via Zero-Shot Paraphrasing ☆101 · Updated 6 months ago
- Code and data for: Low Resource Grammatical Error Correction Using Wikipedia Edits (WNUT 2018) ☆14 · Updated 7 months ago