Tools for extracting parallel corpora from article titles across languages in Wikipedia
☆74 · Feb 25, 2015 · Updated 11 years ago
Alternatives and similar repositories for wikipedia-parallel-titles
Users that are interested in wikipedia-parallel-titles are comparing it to the libraries listed below.
- Lab exercises for the DL4MT winter school at DCU · ☆15 · Oct 21, 2015 · Updated 10 years ago
- Cynical data selection · ☆20 · Jan 16, 2021 · Updated 5 years ago
- Neural machine translation implementation using dynet's python bindings · ☆17 · Jan 24, 2018 · Updated 8 years ago
- Decoding platform for machine translation research · ☆54 · Aug 24, 2019 · Updated 6 years ago
- ☆18 · Oct 5, 2017 · Updated 8 years ago
- A repo for sharing language resources related to the outbreak (in machine readable format) · ☆25 · Sep 22, 2025 · Updated 8 months ago
- C++ implementation of Generalised Brown clustering and python scripts for feature generation · ☆41 · Apr 8, 2016 · Updated 10 years ago
- Fast structured perceptron sequential labeler · ☆15 · Dec 8, 2015 · Updated 10 years ago
- Efficient Markov Chain word alignment · ☆53 · Aug 1, 2021 · Updated 4 years ago
- BiCVM Code · ☆45 · May 14, 2018 · Updated 8 years ago
- Will store links to known evaluation datasets alongside stats to characterize them · ☆24 · Mar 9, 2016 · Updated 10 years ago
- English - Indonesian parallel corpora · ☆17 · Aug 6, 2018 · Updated 7 years ago
- Train bilingual embeddings as described in our NAACL 2015 workshop paper "Bilingual Word Representations with Monolingual Quality in Mind… · ☆79 · Jun 15, 2019 · Updated 6 years ago
- A little text processing library for Scala. · ☆28 · Mar 3, 2016 · Updated 10 years ago
- ☆44 · Nov 30, 2017 · Updated 8 years ago
- Graph-based Dependency Parser · ☆47 · Jan 25, 2016 · Updated 10 years ago
- Witwicky: An implementation of Transformer in PyTorch. · ☆22 · Aug 17, 2020 · Updated 5 years ago
- ☆12 · Dec 9, 2015 · Updated 10 years ago
- Gale&Church (1993) sentence alignment · ☆16 · May 9, 2020 · Updated 6 years ago
- This repository contains additional reference translations for the WMT'14 En-De (newstest2014) and WMT'19 En-Ru (newstest2019) test sets … · ☆15 · Aug 31, 2021 · Updated 4 years ago
- Latent-variable Synchronous Context-Free Grammar Toolkit · ☆10 · Sep 30, 2014 · Updated 11 years ago
- A list of Neural MT implementations · ☆364 · Jul 27, 2022 · Updated 3 years ago
- C++/CUDA toolkit for training sequence and sequence-to-sequence models across multiple GPUs · ☆185 · May 15, 2017 · Updated 9 years ago
- A fast, simple, multilingual tokenizer · ☆29 · May 24, 2017 · Updated 9 years ago
- ☆21 · Dec 9, 2016 · Updated 9 years ago
- Simple CORPORA list crawler · ☆11 · Dec 2, 2016 · Updated 9 years ago
- Fast Word Clustering Software · ☆79 · Feb 8, 2025 · Updated last year
- Lazy python recipes. · ☆10 · Apr 17, 2026 · Updated last month
- ☆34 · Nov 29, 2016 · Updated 9 years ago
- A High-Quality Multilingual Dataset for Structured Documentation Translation · ☆39 · May 1, 2025 · Updated last year
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. · ☆160 · Jun 18, 2024 · Updated last year
- A tool for extracting plain text from Wikipedia dumps · ☆15 · Sep 13, 2018 · Updated 7 years ago
- Open-Source Neural Machine Translation in Tensorflow · ☆803 · Dec 9, 2022 · Updated 3 years ago
- scripts and configuration files for Edinburgh neural MT submission to WMT 16 shared translation task · ☆139 · Nov 5, 2020 · Updated 5 years ago
- Easy Bootstrap Resampling and Approximate Randomization for BLEU, METEOR, and TER using Multiple Optimizer Runs. This implements "Better … · ☆205 · Feb 25, 2023 · Updated 3 years ago
- Top-Down BTG-based Preordering · ☆16 · Jan 14, 2016 · Updated 10 years ago
- A rule-based machine translation system from Ottoman Turkish to Modern Turkish. · ☆23 · Jul 8, 2020 · Updated 5 years ago
- Multilingual image description · ☆45 · Feb 9, 2018 · Updated 8 years ago
- ☆121 · Mar 15, 2017 · Updated 9 years ago