jiaohuix / nmt_data_tools
Machine translation data processing tools
☆10Updated last year
Alternatives and similar repositories for nmt_data_tools
Users that are interested in nmt_data_tools are comparing it to the libraries listed below
- 1.4B sLLM for Chinese and English - HammerLLM🔨☆44Updated last year
- Code for Findings of ACL 2023 paper "Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency …☆10Updated 2 years ago
- ☆35Updated 2 years ago
- ☆40Updated last year
- The repo of "Improving Seq2Seq Grammatical Error Correction via Decoding Interventions"☆30Updated last year
- 🩺 A collection of ChatGPT evaluation reports on various benchmarks.☆50Updated 2 years ago
- ACL2023 (Oral): TemplateGEC: Improving Grammatical Error Correction with Detection Template☆22Updated 2 years ago
- A wide variety of research projects developed by the SpokenNLP team of Speech Lab, Alibaba Group.☆118Updated 4 months ago
- ☆99Updated 3 years ago
- OPD: Chinese Open-Domain Pre-trained Dialogue Model☆75Updated 2 years ago
- ☆25Updated 5 months ago
- [NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective☆36Updated 2 years ago
- ROUGE for multilingual Summarization☆25Updated 4 years ago
- A collection of instruction data and scripts for machine translation.☆20Updated 2 years ago
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆58Updated last year
- Award-winning solution for the iFLYTEK Low-Resource Multilingual Text Translation Challenge☆29Updated 2 years ago
- ☆59Updated 2 years ago
- Official completion of “Training on the Benchmark Is Not All You Need”.☆37Updated 9 months ago
- ☆53Updated 3 years ago
- Tools for formatting WMT hypothesis and test sets in XML☆27Updated 6 months ago
- Code & Data for our Paper "RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation" (EMNLP 2023)☆17Updated last year
- Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"☆44Updated 3 months ago
- [ICML'2024] Can AI Assistants Know What They Don't Know?☆83Updated last year
- [EMNLP 2023] C-STS: Conditional Semantic Textual Similarity☆73Updated last year
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉☆12Updated 2 years ago
- Code for embedding and retrieval research.☆17Updated last year
- The Corpus & Code for EMNLP 2022 paper "FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction" | FCGEC Chinese grammatical error correction corpus and STG model☆119Updated 10 months ago
- This repository open-sources our GEC system submitted by THU KELab (sz) in the CCL2023-CLTC Track 1: Multidimensional Chinese Learner Tex…☆15Updated last year
- We systematically studied the influencing factors when an LLM generates benchmarks. By using our code, you can generate high-quality QA datas…☆20Updated 4 months ago
- This is the official implementation of the paper: "Contrastive Learning of Sentence Embeddings from Scratch"☆39Updated 2 years ago