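A deep dive into the world of large language models (LLMs). This project collects representative text datasets spanning five key dimensions: pre-training corpora, fine-tuning instruction datasets, preference datasets, evaluation datasets, and traditional NLP datasets, along with multimodal datasets. We aim to give researchers and developers comprehensive resources to advance the development and application of AI technology.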
☆20Apr 26, 2024Updated 2 years ago
Alternatives and similar repositories for AwesomeLLMsDatasets
Users that are interested in AwesomeLLMsDatasets are comparing it to the libraries listed below.
- The latest progress of Personalized Large Language Models (LLMs).☆48Jun 9, 2026Updated last week
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues☆13Feb 7, 2024Updated 2 years ago
- Data for evaluating GPT-4V☆11Oct 26, 2023Updated 2 years ago
- ☆17Feb 2, 2024Updated 2 years ago
- ☆11Jun 11, 2024Updated 2 years ago
- OPSTL: Self-supervised Skeleton-based Action Recognition in Occluded Environments☆14Oct 25, 2023Updated 2 years ago
- Build LLM Application with Local Documents☆20Jun 13, 2025Updated last year
- Open-source code for GEAR☆16Dec 3, 2025Updated 6 months ago
- The code and data for the paper "Lost-in-the-Middle in Long-Text Generation: Synthetic Dataset, Evaluation Framework, and Mitigation"☆14Oct 8, 2025Updated 8 months ago
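- A wrapper for calling the chatgpt library that supports both streaming and non-streaming modes and can reproduce a display effect similar to the official ChatGPT☆11Jan 30, 2024Updated 2 years ago
- Campus music submission and voting system. A system for electing annual school music☆10Jun 8, 2026Updated last week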
- ☆13Nov 10, 2022Updated 3 years ago
- [ECCV'24] UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening☆10Dec 18, 2025Updated 6 months ago
- Quantitative trading website; Software Engineering III course project, iteration 3; team project☆11Mar 8, 2018Updated 8 years ago
- An Efficient BPE Algorithm Faster than Hugging Face Tokenizer's Implementation☆13Sep 9, 2024Updated last year
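- NewsApp includes the client source code, server source code, and database files. It is built on Recognition Emotion from Microsoft's AI project ProjectOxford and mainly pushes different categories of news based on the user's facial expression. For the Emotion API, see: https://www.p…☆10Mar 2, 2016Updated 10 years ago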
- Execute a command in the context of the desktop user☆13Aug 23, 2021Updated 4 years ago
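- Chinese word segmentation implemented with BPE. Optimizations: preprocessing, parallel computation, multi-character words, multiple vocabularies☆14May 14, 2022Updated 4 years ago
- Isolated-word speech recognition for the digits 0-9 using a single-component GMM built on MFCC features. MFCC, GMM, sklearn, Isolated word recognition.☆10Nov 18, 2020Updated 5 years ago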
- ☆34Jan 9, 2026Updated 5 months ago
- ☆14Apr 1, 2023Updated 3 years ago
- Scrapes answers from Yuketang (Rain Classroom)☆16Nov 21, 2024Updated last year
- [EMNLP 2024] "ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models"☆27Jun 24, 2024Updated last year
- Monitors Bilibili live-room data, saves it to a database in real time, and shows polished visual statistics charts on a built-in web page.☆13Jan 4, 2022Updated 4 years ago
- ☆13Apr 7, 2022Updated 4 years ago
- Generate Game Character for animation (SSD)☆36Mar 16, 2025Updated last year
- Training neural networks for inverse design of nanophotonic gratings.☆22Dec 15, 2021Updated 4 years ago
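- Introduction to Artificial Intelligence (honors class), second semester of the 2024-2025 academic year☆17Jun 16, 2025Updated last year
- Scrapes stock market data with Python and restructures and analyzes stock quotes and financial data across industries☆10Jul 26, 2020Updated 5 years ago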
- ☆15May 1, 2025Updated last year
- Official implementation of ECCV 2024 paper: Take A Step Back: Rethinking the Two Stages in Visual Reasoning☆13Jun 1, 2025Updated last year
- A dynamic website built with Next.js, using technologies such as Tailwind CSS and several UI libraries from Radix UI and Shadcn UI. …☆20Sep 24, 2024Updated last year
- A college student "Internet+" innovation and entrepreneurship project: "一点到家", the Yundian home-services platform supporting rural revitalization. Front end: WeChat mini program; back end: Spring Boot; database: MySQL. A highly recommended project: the source code is simple and readable, clean and concise, with detailed comments, and is suitable for further development. Creative and close to everyday life, it eases employment pressure, raises farmers' incomes, promotes…☆14Jun 17, 2023Updated 3 years ago
- Python package for temporal evolution of initial conditions under the generalized Lugiato-Lefever equation☆19Sep 22, 2022Updated 3 years ago
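- The "城语" app draws on basic data about A-rated scenic areas, historic sites, and protected cultural heritage units, and uses large-model capabilities to plan Citywalk routes intelligently. It has three core features: designing a route, generating a route guide, and generating recommendation reasons for attractions. Using a large model reduces manual editing and recommendation work and allows personalization to visitors' needs, improving visitors'…☆19Feb 20, 2024Updated 2 years ago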
- Aurora forecasts created from solar wind data (OVATION Prime 2010)☆20Apr 11, 2025Updated last year
- Integrating Large Weather Models with Data Assimilation☆25Jun 2, 2024Updated 2 years ago
- Complete Reinforcement Learning Toolkit for Large Language Models!☆21Aug 2, 2025Updated 10 months ago
- Port of Andrej Karpathy's minbpe to Rust☆32May 6, 2024Updated 2 years ago