shuliu586/AI_Chinese_DataSet_KnowledgeDAO

shuliu586 / AI_Chinese_DataSet_KnowledgeDAO

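Chinese datasets for AI training (continuously updated) and a map of AI companies. Current datasets: 8,000 questions on the catering industry, Baidu Zhidao, the Alpaca Chinese dataset, a computer-domain dataset, the Vicuna dataset, the RedPajama dataset, a Chinese Wikipedia entries dataset, and website and forum Q&A datasets.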

☆66

Alternatives and similar repositories for AI_Chinese_DataSet_KnowledgeDAO

Users that are interested in AI_Chinese_DataSet_KnowledgeDAO are comparing it to the libraries listed below.

asd5510 / fastText-chinese-word2vec-optimization
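Tuning fastText training for Chinese word vectors: weighted fusion of character vectors and word vectors to address over-representation of surface form rather than semantics.
☆12 · Aug 3, 2020 · Updated 5 years ago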
EyreFree / WangCuoS
A mini strip game featuring a pretty girl.
☆13 · Feb 11, 2017 · Updated 9 years ago
XiPotatonium / LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
☆10 · Apr 18, 2023 · Updated 3 years ago
Datastory-CN / DataStoryLLMBenchmark
☆11 · Oct 13, 2023 · Updated 2 years ago
tongji40 / MGTV_AI_Challenge_Anti_Stealing_Link_Rank_10th
10th place in the anti-hotlinking track of the 2nd Mango TV "Malanshan Cup" International Audio and Video Algorithm Competition (2021).
☆17 · May 24, 2021 · Updated 5 years ago
Datastory-CN / ASQP-Datasets
☆16 · Aug 23, 2023 · Updated 2 years ago
libeineu / UMST
☆11 · Jun 1, 2023 · Updated 3 years ago
nlpformyself / rc_tf
My model code for the Baidu machine reading comprehension competition; placed third in the final.
☆14 · Jul 26, 2018 · Updated 8 years ago
15810856129 / Simhash
Deduplicating massive text collections with Simhash.
☆12 · Jun 2, 2018 · Updated 8 years ago
kwaziidev / textractor
Extracts the main body text from HTML, intended for news web pages.
☆15 · Feb 24, 2023 · Updated 3 years ago
ZhangYiBo513 / Simhash-
Deduplicates massive collections of articles (long texts), based on Google's simhash algorithm for large-scale web page deduplication.
☆11 · Dec 8, 2022 · Updated 3 years ago
jiangnanboy / llm_corpus_quality
Chinese corpus cleaning and quality evaluation for large-model pre-training.
☆80 · Jul 25, 2024 · Updated 2 years ago
wooplevip / sedis
SQL for Redis
☆11 · Sep 16, 2022 · Updated 3 years ago
QYHcrossover / rl-tennis
PPO + action mask for an Atari Tennis agent
☆12 · Mar 2, 2023 · Updated 3 years ago
germinator / flume-jdbc
My branch of Apache Flume with a generic JDBC sink (not yet licensed to Apache)
☆11 · Feb 12, 2022 · Updated 4 years ago
alaam / keylines-demo
A demo repo for using keylines with IBM Graph
☆11 · Dec 20, 2016 · Updated 9 years ago
domdanrtsey / orawatch
☆13 · Dec 23, 2020 · Updated 5 years ago
hhnqqq / GemmaLongText
☆15 · Apr 7, 2024 · Updated 2 years ago
Xianchao-Wu / wekws
Production First and Production Ready End-to-End Keyword Spotting Toolkit
☆12 · May 30, 2022 · Updated 4 years ago
mt-upc / transformer-contributions-nmt
☆18 · Oct 6, 2022 · Updated 3 years ago
dengwentao99 / SLJA
☆22 · May 22, 2024 · Updated 2 years ago
datawhalechina / accessible-workflow
☆11 · Dec 21, 2024 · Updated last year
beamline / framework
The Beamline streaming process mining framework
☆14 · Oct 5, 2023 · Updated 2 years ago
helloJamest / Personalization-recommendation
A project about personalized recommendation (UserCF, itemCF, LFM, Personal Rank)
☆18 · Sep 20, 2020 · Updated 5 years ago
openfeedback / superhf
Open-source Human Feedback Library
☆11 · Oct 25, 2023 · Updated 2 years ago
eastzq / simpledatax-service
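Repackages DataX as a Java service package: supports running multiple instances concurrently in a single JVM, integrates scheduling code, exposes an API, and can be integrated directly into an existing service framework.
☆38 · Dec 6, 2022 · Updated 3 years ago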
rmetzger / flink-community-tools
☆14 · Nov 16, 2022 · Updated 3 years ago
daily-demos / rtvi-client-android-demo
☆12 · Dec 11, 2024 · Updated last year
hiyoung123 / DuplicateRemove
A simhash-based text deduplication algorithm.
☆20 · Jun 18, 2021 · Updated 5 years ago
wwcxjun / tanbaiyan
Backend for the Tanbaiyan (坦白言) mini program.
☆12 · Dec 3, 2019 · Updated 6 years ago
kingking888 / CommNewsExtractor
General-purpose article extraction: body text, title, time, author, images, audio and video, contact information, and more.
☆23 · Mar 19, 2023 · Updated 3 years ago
BunsenFeng / botsay
What does the bot say? ACL 2024
☆28 · Aug 27, 2024 · Updated last year
realMoana / ProxyExplainer
ProxyExplainer for Graph Neural Networks
☆16 · Oct 24, 2024 · Updated last year
smallyunet / activiti-desginer-jquery
Web version of the activiti6 designer: an online editor for activiti workflows (jQuery).
☆14 · Jul 22, 2020 · Updated 6 years ago
zhengdaoli / ChatGptAPITools
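A collection of assistant tools for work and daily life built on the ChatGPT API. Tuned prompts are already included, so the functions can be called directly with no further prompt tuning. The main tools cover news, article summarization, English translation and improvement, paper polishing, and more.
☆10 · Mar 19, 2023 · Updated 3 years ago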
usail-hkust / benchmark_inference_time_computation_LLM
[NeurIPS 2025] Bag of Tricks for Inference-time Computation of LLM Reasoning
☆16 · Sep 20, 2025 · Updated 10 months ago
ken-xue / pipeline
A lightweight process orchestration and execution engine.
☆13 · May 2, 2022 · Updated 4 years ago
qindongliang / hbase-increment-index
Secondary indexing for HBase, implemented with HBase + Solr.
☆46 · Jul 12, 2026 · Updated 2 weeks ago
renatoathaydes / osgiaas
OSGiaaS - OSGi as a Service
☆17 · Jun 6, 2018 · Updated 8 years ago