share data, prompt data, pretraining data
☆36 Nov 30, 2023 Updated 2 years ago
Alternatives and similar repositories for aigc_data
Users that are interested in aigc_data are comparing it to the libraries listed below
- Reinforcement learning training for GPT-2, LLaMA, BLOOM, and other LLMs ☆27 Sep 19, 2023 Updated 2 years ago
- RWKV fine-tuning ☆37 Apr 22, 2024 Updated last year
- everyone_can_pretrain_language_model ☆25 Jan 13, 2021 Updated 5 years ago
- ☆32 Jun 5, 2025 Updated 9 months ago
- deep learning ☆151 May 6, 2025 Updated 10 months ago
- bilibili-nlp ☆30 Sep 24, 2022 Updated 3 years ago
- This project consists of three modules. Intent recognition: determine whether the user's intent is business-related or chit-chat. Model retrieval: build a corpus; when the user issues a new query (classified as a business dialogue by intent recognition), match the best response for that query, using HSWN for recall (coarse ranking), then compute sentence similarity and use Lig… ☆12 Feb 18, 2021 Updated 5 years ago
- This repository contains the data used for the paper "Entity Recognition at First Sight: Improving NER with Eye Movement Information" by … ☆11 Jan 22, 2020 Updated 6 years ago
- ☆16 May 31, 2024 Updated last year
- ChatGLM-6B fine-tuning and Alpaca fine-tuning ☆1,537 Mar 9, 2025 Updated last year
- Python client designed specifically for large-scale requests to the OpenAI interface ☆23 Feb 29, 2024 Updated 2 years ago
- SMP2018 Chinese Human-Computer Dialogue Technology Evaluation (ECDT) ☆15 Oct 25, 2018 Updated 7 years ago
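- Fine-tuning RWKV-World model ☆26 Jun 6, 2023 Updated 2 years ago
- Chinese few-shot NER ☆16 Aug 28, 2022 Updated 3 years ago
- Qwen models fine-tuning ☆107 Mar 9, 2025 Updated last year
- ☆81 May 15, 2024 Updated last year
- Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with only RLHF ☆198 May 23, 2023 Updated 2 years ago
- Classification on the Cnews dataset, using torchtext for text preprocessing ☆11 Sep 16, 2022 Updated 3 years ago
- Fine-tuning the Qwen1.5-0.5B-Chat model for general information extraction, aiming to: verify how the generative approach compares with extractive NER; give beginners a simple model fine-tuning workflow with as little code as possible; handle data formatting for large-model training. ☆15 Sep 6, 2024 Updated last year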
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆23 Oct 1, 2024 Updated last year
- Latin texts annotated for named entities and NER tagger used for the Herodotos Project (Ohio State University / Ghent University) ☆11 Sep 26, 2022 Updated 3 years ago
- Asynchronous voice dialogue component. ☆32 Mar 13, 2025 Updated last year
- 140-million-scale general knowledge graph question answering ☆18 Aug 10, 2020 Updated 5 years ago
- A Slot-filling based Dialog Manager for Task-oriented Bot ☆12 Dec 29, 2016 Updated 9 years ago
- ☆15 Dec 22, 2017 Updated 8 years ago
- MIT 6.S081 lab notes, with the environment set up using Docker + code-server (web-based VS Code) to provide a clean lab environment that works out of the box; see the website below for detailed usage instructions ☆12 Jan 28, 2024 Updated 2 years ago
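- A TensorFlow implementation of "QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension" ☆31 Jun 2, 2018 Updated 7 years ago
- GPT-SoVITS API for v3 version ☆14 Mar 5, 2025 Updated last year
- Chinese and English sensitive words, language detection, Chinese and foreign mobile/landline number location and carrier lookup, gender inference from names, phone number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character-decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, traditional/simplified Chinese conversion, English imitation of Chinese pronunciation, Wang Feng lyrics generator, job title lexicon, synonym lexicon, antonym lexicon, negation word lexicon, 汽… ☆16 May 6, 2020 Updated 5 years ago
- Automatic prompt optimization framework for multi-step agent tasks. ☆37 Nov 12, 2024 Updated last year
- A HuggingFace-based tool for training and testing large language models. Supports web UI and terminal inference for each model, parameter-efficient and full-parameter training (pretraining, SFT, RM, PPO, DPO), and merging and quantization. ☆223 Dec 8, 2023 Updated 2 years ago
- An e-commerce LLM based on the Qwen2.5 series, fine-tuned (SFT) on e-commerce data. It is an upgraded version of https://github.com/leeguandong/EcommerceLLM. Qwen2.5 performs very well. ☆13 Oct 4, 2024 Updated last year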
- ☆14 Dec 26, 2022 Updated 3 years ago
- This repository mainly records search engine study notes for NLP algorithm engineers ☆13 Apr 9, 2022 Updated 3 years ago
- Text rule extraction tool ☆22 Feb 20, 2026 Updated last month
- ☆23 Jul 17, 2023 Updated 2 years ago
- ☆14 Aug 26, 2024 Updated last year
- Quiz module, developed with Vue ☆11 Jun 27, 2017 Updated 8 years ago
- Files from the published AlphaStar paper by DeepMind ☆18 Nov 14, 2019 Updated 6 years ago