alibaba-damo-academy / SpokenNLP
A wide variety of research projects developed by the SpokenNLP team of Speech Lab, Alibaba Group.
☆110 Updated 9 months ago
Related projects
Alternatives and complementary repositories for SpokenNLP
- The corpus and code for the EMNLP 2022 paper "FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction" | FCGEC Chinese grammatical error correction corpus and STG model ☆107 Updated 3 months ago
- Code & data for our EMNLP 2022 paper "SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser" ☆80 Updated 8 months ago
- A summary of open-source large language models and low-cost replication methods for ChatGPT. ☆135 Updated last year
- Text deduplication ☆67 Updated 6 months ago
- Code & data for our paper "NaSGEC: Multi-Domain Chinese Grammatical Error Correction for Native Speaker Texts" (ACL 2023 Findings) ☆76 Updated last year
- Chinese instruction tuning datasets ☆118 Updated 7 months ago
- PICA: a multi-turn empathetic dialogue model ☆86 Updated last year
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF. ☆62 Updated last year
- Rephrasing Language Model for CSC (AAAI 2024) ☆37 Updated 6 months ago
- ☆30 Updated last year
- ☆52 Updated 9 months ago
- text embedding ☆140 Updated last year
- code for piccolo embedding model from SenseTime ☆112 Updated 6 months ago
- Code implementation of Dynamic NTK-ALiBi for Baichuan: inference on longer texts without fine-tuning ☆46 Updated last year
- Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers. ☆111 Updated last year
- NLU & NLG (zero-shot) based on the mengzi-t5-base-mt pretrained model ☆75 Updated 2 years ago
- Source code for the ACL 2023 paper "Decoder Tuning: Efficient Language Understanding as Decoding" ☆48 Updated last year
- ☆46 Updated 11 months ago
- CCL 2022 Chinese learner text correction evaluation ☆135 Updated last year
- Correcting Chinese Spelling Errors with Phonetic Pre-training (unofficial implementation) ☆38 Updated 2 years ago
- Python ROUGE Score Implementation for Chinese Language Task (official rouge score) ☆82 Updated 4 months ago
- OPD: Chinese Open-Domain Pre-trained Dialogue Model ☆74 Updated last year
- Code and data for "CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers" ☆57 Updated 3 months ago
- 1.4B sLLM for Chinese and English - HammerLLM🔨 ☆43 Updated 7 months ago
- MD5 links for a Chinese book corpus ☆212 Updated 9 months ago
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models" ☆109 Updated 5 months ago
- Chinese machine reading comprehension datasets ☆100 Updated 3 years ago
- The code for our ACL 2022 Findings paper: CRACSpell: A Contextual Typo Robust Approach with Copy Mechanism to Improve Chinese Spelling Cor… ☆74 Updated 2 years ago
- ☆173 Updated last year
- Large-scale Chinese corpus ☆38 Updated 5 years ago