Transform TMX to text
☆28 Nov 23, 2022 Updated 3 years ago
Alternatives and similar repositories for tmxt
Users who are interested in tmxt are comparing it to the repositories listed below.
- ☆13 Aug 23, 2024 Updated last year
- Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data. ☆20 Nov 6, 2023 Updated 2 years ago
- Tool for manual evaluation of parallel sentences. ☆15 Jan 26, 2026 Updated last month
- Tool to fix bitexts and tag near-duplicates for removal ☆34 Sep 4, 2025 Updated 5 months ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 Jun 18, 2024 Updated last year
- Best Practices in Translation Memory Management ☆47 Dec 14, 2018 Updated 7 years ago
- bin files ☆13 Jan 30, 2025 Updated last year
- Corpus preprocessing ☆100 Mar 16, 2024 Updated last year
- Fast Neural Machine Translation in C++ - development repository ☆22 May 12, 2024 Updated last year
- A High-Quality Multilingual Dataset for Structured Documentation Translation ☆37 May 1, 2025 Updated 10 months ago
- Language-agnostic political event coding using universal dependencies ☆18 Jun 4, 2019 Updated 6 years ago
- OpusFilter - Parallel corpus processing toolkit ☆115 Feb 11, 2026 Updated 2 weeks ago
- An educational tool to train, inspect, evaluate and translate using neural engines ☆19 Mar 13, 2025 Updated 11 months ago
- Bitextor generates translation memories from multilingual websites ☆301 Nov 11, 2024 Updated last year
- Terminology Dataset ☆23 Feb 27, 2020 Updated 6 years ago
- ☆24 Nov 29, 2017 Updated 8 years ago
- ☆21 Feb 13, 2023 Updated 3 years ago
- Library for fast text representation and classification. ☆31 Jan 9, 2024 Updated 2 years ago
- ☆24 Jun 25, 2025 Updated 8 months ago
- Program used to split text into segments ☆28 Oct 27, 2024 Updated last year
- ☆29 Jun 10, 2024 Updated last year
- Tools for formatting WMT hypothesis and test sets in XML ☆27 Apr 18, 2025 Updated 10 months ago
- Targeted language identifier, based on FastText and Hunspell. ☆38 Sep 4, 2025 Updated 5 months ago
- ☆81 Jan 30, 2026 Updated last month
- ☆35 Jun 15, 2023 Updated 2 years ago
- Translation Memory Open-source Purifier ☆35 Nov 6, 2022 Updated 3 years ago
- A parallel evaluation data set of SAP software documentation with document structure annotation ☆14 Jul 30, 2025 Updated 7 months ago
- mReasoner is a unified computational implementation of the model theory of thinking and reasoning ☆13 Aug 17, 2023 Updated 2 years ago
- Examples, tutorials and use cases for Marian, including our WMT-2017/18 baselines. ☆81 Apr 8, 2023 Updated 2 years ago
- Containerfile for the Vanilla OS Desktop+Nvidia image. ☆16 Feb 5, 2026 Updated 3 weeks ago
- ☆10 Jul 6, 2023 Updated 2 years ago
- C4RepSet: Representative Subset from C4 data for Training Pre-trained LMs ☆11 Jan 13, 2023 Updated 3 years ago
- Fake NEWS detector using LIAR dataset. ☆11 Aug 19, 2019 Updated 6 years ago
- Code that drives the public web-based tools for the Media Cloud Online News Archive and Directory. ☆11 Updated this week
- Wikimedia Enterprise - client SDK in Python ☆20 Nov 11, 2025 Updated 3 months ago
- A php library for working with Table Schema. ☆12 Jul 28, 2025 Updated 7 months ago
- Security research organization dedicated to finding low hanging, critical, vulnerabilities. ☆15 May 12, 2022 Updated 3 years ago
- COMET for African languages ☆10 Jan 24, 2025 Updated last year
- Code and data for the Walert large language model-based chatbot ☆12 Aug 14, 2025 Updated 6 months ago