Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation
☆15Aug 27, 2024Updated last year
Alternatives and similar repositories for CourseraParallelCorpusMining
Users that are interested in CourseraParallelCorpusMining are comparing it to the libraries listed below
- Scripts for creating a Japanese-English parallel corpus and training NMT models☆18Nov 9, 2021Updated 4 years ago
- A parallel evaluation data set of SAP software documentation with document structure annotation☆14Jul 30, 2025Updated 7 months ago
- NIILC QA data☆18Nov 20, 2015Updated 10 years ago
- ☆13Dec 11, 2020Updated 5 years ago
- Unsupervised parallel sentence extraction from comparable corpora☆16Aug 6, 2019Updated 6 years ago
- Efficient teacher-student models and scripts to make them☆54Dec 16, 2023Updated 2 years ago
- A Japanese dependency parser based on BERT☆23Oct 26, 2022Updated 3 years ago
- Python version of the Japanese semantic role labeling system (ASA)☆22Nov 11, 2022Updated 3 years ago
- Tokyo Metropolitan University Japanese Twitter corpus☆21Mar 14, 2016Updated 9 years ago
- A library of translation-based text similarity measures☆25Dec 11, 2023Updated 2 years ago
- Japanese Movie Recommendation Dialogue dataset☆29Jul 19, 2022Updated 3 years ago
- Home of the yandexdataschool.ru algorithms course page☆28Feb 22, 2026Updated last week
- Yet another Python binding for Juman++/KNP/KWJA☆38Updated this week
- Bilingual sentence aligner☆28Nov 25, 2025Updated 3 months ago
- ☆35Dec 17, 2020Updated 5 years ago
- ☘️ Code for Convex Aggregation for Opinion Summarization (Iso et al; Findings of EMNLP 2021)☆35Dec 22, 2022Updated 3 years ago
- A summarizer for Japanese articles (but ChatGPT is better)☆10Aug 1, 2022Updated 3 years ago
- Using the Draw.io tool to create diagrams of Terraform deployments.☆10Dec 18, 2025Updated 2 months ago
- Automated Continuous Data Quality Measurement☆12Nov 15, 2023Updated 2 years ago
- Transformers at any scale☆42Jan 18, 2024Updated 2 years ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa.☆42Oct 13, 2022Updated 3 years ago
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.☆39Sep 12, 2024Updated last year
- A High-Quality Multilingual Dataset for Structured Documentation Translation☆37May 1, 2025Updated 10 months ago
- A large parallel corpus of English and Japanese☆87Nov 1, 2017Updated 8 years ago
- A deep-learning-based application intended to help visually impaired people. The application automatically generates the tex…☆12Oct 2, 2020Updated 5 years ago
- ☆10May 5, 2022Updated 3 years ago
- ATC-Anno is an annotation tool for Air Traffic Control data that offers automatic semantic and concept annotation.☆12Nov 17, 2023Updated 2 years ago
- Unofficial ontologies for Official Registers of Russian Federal Tax Service☆10Apr 7, 2018Updated 7 years ago
- COMET for African languages☆10Jan 24, 2025Updated last year
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best …☆10Nov 3, 2023Updated 2 years ago
- The platform behind nous le peuple. A fork of Pligg.☆10Sep 29, 2015Updated 10 years ago
- RemindMe is a reminder and task-management app designed to help you stay organised and on top of your to-do list.☆16Apr 5, 2024Updated last year
- Node.js wrapper for the GLTF2Loader library from Three.js☆10Nov 8, 2017Updated 8 years ago
- My OpenCode and Oh-My-OpenCode configuration files with API proxy setup documentation☆32Jan 5, 2026Updated last month
- Unreal Engine 5 version of VOICEVOX Engine☆13Nov 29, 2025Updated 3 months ago
- Super simple, zero config options, <2kb declarative tooltip library with no dependencies.☆17Jun 2, 2023Updated 2 years ago
- Workshop "Agile Mindset in the Design of Information and Production Systems", 32 hrs☆13Nov 3, 2023Updated 2 years ago
- Kyoto University Web Document Leads Corpus☆83Dec 18, 2023Updated 2 years ago
- Japanese BERT trained on Aozora Bunko and Wikipedia, pre-tokenized by MeCab with UniDic & SudachiPy☆40Aug 8, 2020Updated 5 years ago