TUFS Asian Language Parallel Corpus
☆52 · May 1, 2023 · Updated 2 years ago
Alternatives and similar repositories for TALPCo
Users that are interested in TALPCo are comparing it to the libraries listed below.
- ☆10 · Jan 14, 2025 · Updated last year
- English - Indonesian parallel corpora · ☆17 · Aug 6, 2018 · Updated 7 years ago
- CRF syllable segmenter for Thai · ☆27 · May 3, 2024 · Updated last year
- Java library to tokenize Thai text into a list of TCCs · ☆19 · May 30, 2017 · Updated 8 years ago
- A public repository for corrupt0 datathon's court data · ☆11 · Jul 6, 2019 · Updated 6 years ago
- Unsupervised parallel sentence extraction from comparable corpora · ☆16 · Aug 6, 2019 · Updated 6 years ago
- Parallel Universal Dependencies. · ☆15 · Nov 12, 2025 · Updated 4 months ago
- Yaitron English-Thai and Thai-English dictionary · ☆34 · Oct 13, 2020 · Updated 5 years ago
- basically all words, in a compressed form · ☆17 · Jan 9, 2023 · Updated 3 years ago
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation · ☆15 · Aug 27, 2024 · Updated last year
- Indonesian-English Bilingual Corpus · ☆18 · Jul 16, 2012 · Updated 13 years ago
- Official code and data of "3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset" · ☆12 · Dec 8, 2024 · Updated last year
- Myanmar and Thai Language Resources · ☆10 · Jul 18, 2022 · Updated 3 years ago
- ☆42 · May 4, 2024 · Updated last year
- ☆12 · Dec 7, 2022 · Updated 3 years ago
- Source code for the NAACL 2021 paper: Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation · ☆15 · Jul 19, 2021 · Updated 4 years ago
- 🐍🍑 Python 3 library for managing, annotating, and converting natural language corpuses using popular formats (CoNLL, ELAN, Praat, CSV, … · ☆21 · Jun 26, 2024 · Updated last year
- ☆11 · Dec 14, 2020 · Updated 5 years ago
- A Dataset for Thai Text Summarization with over 310K articles. · ☆29 · Feb 4, 2023 · Updated 3 years ago
- The first-ever vast natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, … · ☆79 · Nov 16, 2024 · Updated last year
- Thai Spelling Check · ☆41 · Apr 2, 2023 · Updated 2 years ago
- A multi-language segmenter using high-order CRF. · ☆17 · Feb 27, 2020 · Updated 6 years ago
- Thai Grapheme to Phoneme (G2P) Wiktionary Corpus · ☆13 · Jul 25, 2022 · Updated 3 years ago
- GOPHI: an AMR-to-English Verbalizer · ☆11 · Feb 5, 2020 · Updated 6 years ago
- ☆14 · Dec 23, 2024 · Updated last year
- SnapLogic Snap Recommendation Workshop with Decision Trees and Deep Learning · ☆14 · Jun 5, 2019 · Updated 6 years ago
- Indonesian Manually Tagged Corpus · ☆92 · Jul 5, 2022 · Updated 3 years ago
- SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL) · ☆16 · Jul 27, 2024 · Updated last year
- freely reusable language resources for Myanmar · ☆24 · Dec 30, 2015 · Updated 10 years ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) · ☆36 · Jun 29, 2025 · Updated 8 months ago
- Pretraining scripts for BART transformer model · ☆12 · May 15, 2023 · Updated 2 years ago
- ☆17 · Dec 12, 2024 · Updated last year
- A comprehensive evaluation framework for the SEA region · ☆19 · Mar 4, 2026 · Updated 3 weeks ago
- A package for handy processing of semantic graphs such as AMR, with a special focus on standardized evaluation · ☆26 · May 1, 2025 · Updated 10 months ago
- python package for unsupervised text segmentation. · ☆14 · Oct 31, 2016 · Updated 9 years ago
- ☆40 · Feb 1, 2023 · Updated 3 years ago
- Generate datasets for training text detection · ☆12 · Jul 1, 2020 · Updated 5 years ago
- Curated list of publicly available parallel corpus for Indian Languages · ☆37 · Jul 15, 2021 · Updated 4 years ago
- End-to-end integration of HuggingFace's models for sequence labeling. · ☆11 · Oct 4, 2020 · Updated 5 years ago