☆364 · Jun 13, 2024 · Updated last year
Alternatives and similar repositories for FlagData
Users that are interested in FlagData are comparing it to the libraries listed below.
- ☆185 · Nov 13, 2023 · Updated 2 years ago
- Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷 ☆6,234 · Updated this week
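- MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes news, essays, novels, books, magazines… ☆4,157 · Mar 22, 2026 · Updated 2 weeks ago
- Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, … ☆6,653 · Oct 24, 2024 · Updated last year
- Text deduplication ☆78 · May 23, 2024 · Updated last year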
- FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models. ☆3,878 · Nov 11, 2025 · Updated 4 months ago
- FlagEval is an evaluation toolkit for AI large foundation models. ☆337 · Apr 24, 2025 · Updated 11 months ago
- [ACL 2024] IEPile: A Large-Scale Information Extraction Corpus ☆212 · Jan 9, 2025 · Updated last year
- TigerBot: A multi-language multi-task LLM ☆2,266 · Dec 28, 2024 · Updated last year
- BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational LLM) ☆8,289 · Oct 16, 2024 · Updated last year
- ☆979 · Feb 7, 2025 · Updated last year
- Retrieval and Retrieval-augmented LLMs ☆11,502 · Apr 1, 2026 · Updated last week
- SuperCLUE-Agent: a benchmark for evaluating core agent capabilities on native Chinese tasks ☆94 · Nov 9, 2023 · Updated 2 years ago
- An Open-sourced Knowledgeable Large Language Model Framework. ☆1,384 · Jan 11, 2025 · Updated last year
- XuanYuan (轩辕): Du Xiaoman's Chinese financial conversational LLM ☆1,307 · Jan 7, 2025 · Updated last year
- How to train an LLM tokenizer ☆152 · Jul 13, 2023 · Updated 2 years ago
- We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tunin… ☆2,801 · Dec 12, 2023 · Updated 2 years ago
- pCLUE: 1,000,000+ multi-task prompt-learning dataset