NiuTrans / LaMaTE
Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation
☆22Updated 2 months ago
Alternatives and similar repositories for LaMaTE
Users who are interested in LaMaTE are comparing it to the libraries listed below
- A list of conferences and journals relevant to machine translation☆33Updated 3 years ago
- TiLamb (Tibetan Large Language Model Base), a Tibetan large language model built by incremental pre-training on LLaMA2-7B☆23Updated last year
- An introduction to basic concepts of Transformers and key techniques of their recent advances.☆49Updated last year
- Code & data for our EMNLP2022 paper "SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser"☆84Updated last year
- OMGEval😮: An Open Multilingual Generative Evaluation Benchmark for Foundation Models☆33Updated 10 months ago
- ☆14Updated last year
- LaTeX Thesis Template for Beijing Language and Culture University☆15Updated last month
- An easy-to-use kNN-MT toolkit☆104Updated last year
- ☆176Updated 10 months ago
- We present a list of languages with their codes, families, regions, etc. We also present a list of multilingual corpora (with URLs).☆85Updated 4 years ago
- Yet Another Chinese Learner Corpus☆77Updated 3 years ago
- The repository of EMNLP 2023 "A Frustratingly Easy Plug-and-Play Detection-and-Reasoning Module for Chinese Spelling Check"☆17Updated last year
- ☆78Updated 9 months ago
- Yet Another Chinese Spelling Check Dataset (YACSC)☆19Updated last year
- code for Teaching LM to Translate with Comparison☆39Updated last year
- ☆11Updated last month
- A repository for organizing content related to Large Speech (Audio) Models, including papers, data, applications, tools, and so on.☆18Updated 4 months ago
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"☆128Updated last year
- ☆75Updated 5 months ago
- A wide variety of research projects developed by the SpokenNLP team of Speech Lab, Alibaba Group.☆117Updated last week
- ☆40Updated last year
- ☆16Updated last year
- A retrieval augmented sequence modeling toolkit implemented based on Fairseq☆30Updated 2 years ago
- Source code for our EMNLP 2022 paper "Wait-info Policy: Balancing Source and Target at Information Level for Simultaneous Machine Transla…☆7Updated 2 years ago
- code and data for "CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers"☆69Updated 9 months ago
- Code and data of the paper "MCTS: A Multi-Reference Chinese Text Simplification Dataset".☆31Updated last year
- The Corpus & Code for EMNLP 2022 paper "FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction" | FCGEC Chinese grammatical error correction corpus and STG model☆117Updated 5 months ago
- LongMIT: Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets☆37Updated 8 months ago
- Chinese instruction tuning datasets☆131Updated last year
- ☆120Updated 3 years ago