NiuTrans / LMT
Building an inclusive, scalable, and high-performance multilingual translation model
☆120Updated 2 weeks ago
Alternatives and similar repositories for LMT
Users who are interested in LMT are comparing it to the libraries listed below
- An introduction to basic concepts of Transformers and key techniques of their recent advances.☆51Updated 2 years ago
- ☆96Updated 2 years ago
- This repository provides an implementation of "A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction B…☆86Updated 7 months ago
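- This project aims to quickly detect and convert the encodings of large numbers of text files, to assist data cleaning for the mnbvc corpus project☆69Updated 3 months ago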
- We present a list of languages with their codes, families, regions, etc. We also present a list of multilingual corpora (with URLs).☆87Updated 4 years ago
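- BayLing (百聆) is a LLaMA-based English/Chinese large language model enhanced with language alignment. It has strong English/Chinese capabilities and reaches 90% of ChatGPT's performance across multilingual and general-task evaluations. BayLing is an English/Chinese LLM equipped with advanced l…☆319Updated last year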
- Text deduplication☆77Updated last year
- Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation☆28Updated 7 months ago
- A wide variety of research projects developed by the SpokenNLP team of Speech Lab, Alibaba Group.☆124Updated 8 months ago
- [LREC] MMChat: Multi-Modal Chat Dataset on Social Media☆108Updated 3 years ago
- The ParroT framework to enhance and regulate the Translation Abilities during Chat based on open-sourced LLMs (e.g., LLaMA-7b, Bloomz-7b1…☆177Updated last year
- A purer Tokenizer with a higher compression rate☆490Updated last year
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"☆136Updated last year
- CINO: Pre-trained Language Models for Chinese Minority Languages☆260Updated 6 months ago
- [ACL'24] MC^2: A Multilingual Corpus of Minority Languages in China (Tibetan, Uyghur, Kazakh, and Mongolian)☆29Updated 3 weeks ago
- A Fast Neural Machine Translation System developed in C++.☆146Updated last year
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models☆48Updated last year
- The Corpus & Code for EMNLP 2022 paper "FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction" | FCGEC Chinese grammatical error correction corpus and STG model☆120Updated last year
- ☆254Updated last year
- Efficient, Low-Resource, Distributed transformer implementation based on BMTrain☆266Updated 2 years ago
- Model Compression for Big Models☆167Updated 2 years ago
- ☆35Updated 2 years ago
- ☆184Updated 2 years ago
- ☆161Updated 5 months ago
- ☆78Updated 2 years ago
- [ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.☆255Updated last year
- [ACL'2024 Findings] GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation☆76Updated last year
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆90Updated last year
- NiuTrans.SMT is an open-source statistical machine translation system developed by a joint team from NLP Lab. at Northeastern University …☆162Updated last year
- A list of conferences and journals relevant to machine translation☆33Updated 3 years ago