zinccat / dolly_chinese
Translation of the databricks-dolly-15k dataset to Chinese for commercial use.
☆19 Updated 2 years ago
Alternatives and similar repositories for dolly_chinese
Users who are interested in dolly_chinese are comparing it to the repositories listed below.
- MOSS 003 WebSearchTool: A simple but reliable implementation ☆45 Updated 2 years ago
- MultilingualShareGPT, the free multi-language corpus for LLM training ☆73 Updated 2 years ago
- backend for fastnlp MOSS project ☆59 Updated last year
- Gaokao Benchmark for AI ☆109 Updated 3 years ago
- ROUGE for multilingual Summarization ☆25 Updated 4 years ago
- GAOKAO-Bench-Updates is a supplement to the GAOKAO-Bench, a dataset to evaluate large language models. ☆37 Updated 11 months ago
- A unified tokenization tool for Images, Chinese and English. ☆153 Updated 2 years ago
- ☆220 Updated 3 years ago
- [LREC] MMChat: Multi-Modal Chat Dataset on Social Media ☆108 Updated 3 years ago
- OPD: Chinese Open-Domain Pre-trained Dialogue Model ☆75 Updated 2 years ago
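- Large-scale Chinese corpus ☆44 Updated 6 years ago
- Kanchil (鼷鹿, the mouse-deer) is the world's smallest even-toed ungulate; this open-source project explores whether small models (under 6B) can also be aligned with human preferences. ☆113 Updated 2 years ago
- Analytical solutions for logistic regression and single-layer softmax ☆12 Updated 4 years ago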
- A preliminary evaluation of ChatGPT/GPT-4 for machine translation. ☆248 Updated 8 months ago
- ☆59 Updated 2 years ago
- Backup of the Zhihu exposé about 香侬科技 (北京香侬慧语科技有限责任公司) ☆43 Updated 5 years ago
- Chinese large language model evaluation, phase 1 ☆110 Updated 2 years ago
- Latest Evaluation Toolkit (LatestEval). Assessing the language models with latest, uncontaminated materials. ☆27 Updated 10 months ago
- ☆22 Updated 2 years ago
- MD5 links for a Chinese book corpus ☆218 Updated last year
- OpenLLMDE: An open source data engineering framework for LLMs ☆18 Updated 2 years ago
- This repository is the official implementation of our EMNLP 2022 paper ELMER: A Non-Autoregressive Pre-trained Language Model for Efficie… ☆26 Updated 3 years ago
- This project collects the publicly released Chinese (and English) training sets used in current dialogue-system papers. Datasets for training Dialog. ☆22 Updated 6 years ago
- Introduction to CPM ☆165 Updated 4 years ago
- Inspired by google c4, here is a series of colossal clean data cleaning scripts focused on CommonCrawl data processing. Including Chinese… ☆134 Updated 2 years ago
- ChatGLM-6B-Slim: ChatGLM-6B with its 20K image tokens pruned; identical performance with a smaller GPU memory footprint. ☆127 Updated 2 years ago
- Awesome Reinforcement Learning from Human Feedback, the secret behind ChatGPT XD ☆23 Updated 3 years ago
- JAX implementation of the bart-base model ☆34 Updated 2 years ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆178 Updated 2 years ago
- Calculate the probability of a paper being accepted by EMNLP2023 based on score distribution of ACL2023. ☆14 Updated 2 years ago