yiyepiaoling0715 / codellm-data-preprocess-pipeline
Data processing for code LLM pre-training, fine-tuning, and DPO: a SOTA industry processing pipeline
☆51 Jul 25, 2024 Updated last year
Alternatives and similar repositories for codellm-data-preprocess-pipeline
Users that are interested in codellm-data-preprocess-pipeline are comparing it to the libraries listed below
- An introduction to using Docker and Docker Compose. ☆21 Sep 4, 2024 Updated last year
- ICSE'22 - Havoc-MAB: Enhancing AFL havoc mutation with Two-layer Multi-Armed Bandit ☆12 Sep 19, 2022 Updated 3 years ago
- Dify Streamlit Chat App ☆14 Aug 31, 2024 Updated last year
- First-place (top-1) solution for the Tianchi algorithm competition "BetterMixture - LLM Data Mixing Challenge" ☆34 Jul 7, 2024 Updated last year
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper ☆32 May 29, 2024 Updated last year
- source code of EfficientTTS 2 ☆20 Feb 18, 2024 Updated last year
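- Batch data processing with LLMs; currently supports OCR with various LLMs, including Tongyi Qianwen (Qwen), Moonshot AI, Baidu PaddlePaddle OCR, OpenAI, and LLAVA. Use LLM to generate or clean data for academic use. Support OCR with qwen, m… ☆16 Sep 15, 2024 Updated last year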
- A minimal LLM sales agent framework for sales agent fast deployment and benchmark. Support OpenAI models, Claude, HuggingFace models, Gem… ☆19 Sep 6, 2024 Updated last year
- ☆11 Feb 6, 2026 Updated last week
- Speech recognition built on FunASR, with a standard version and an ONNX version (recommended). ☆48 Oct 12, 2024 Updated last year
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and … ☆18 Apr 12, 2024 Updated last year
- The official code repo and data hub of top_nsigma sampling strategy for LLMs. ☆26 Feb 11, 2025 Updated last year
- A general-purpose collection of simple tools ☆22 Oct 6, 2024 Updated last year
- Replication package for ICSE2023 paper-CoCoSoDa: Effective Contrastive Learning for Code Search ☆24 Mar 28, 2023 Updated 2 years ago
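- Uses the open-source LangFlow framework to build LLM applications with zero code, such as an intelligent customer-service assistant for data-plan recommendations and RAG applications, and shows two ways to integrate the resulting workflows into your own project ☆31 Sep 9, 2024 Updated last year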
- ☆31 Oct 2, 2024 Updated last year
- ☆27 Jul 25, 2023 Updated 2 years ago
- LLM-based Multi-Agent system architecture design and hands-on project code ☆34 Nov 30, 2024 Updated last year
- Qwen1.5-SFT (Alibaba, Ali), Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat fine-tuning (transformers)/LORA (peft)/inference ☆71 May 17, 2024 Updated last year
- A simple WeChat Official Account layout tool based on Dify ☆16 Jun 27, 2025 Updated 7 months ago
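- Y-Agent Studio is an agent development suite for enterprise applications, with Y-Agent as its core module. It includes features that vertical domains need most: agent orchestration, RAG, workflow logs, unit tests, workflow tests, and corpus generation. Agent orchestration supports both multi-agent collaboration and mixed workflow orchestration within a single workflow… ☆25 Oct 4, 2025 Updated 4 months ago
- A Complete Introduction to Building Generative AI Apps with Dify ☆17 May 25, 2025 Updated 8 months ago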
- ☆42 Mar 6, 2025 Updated 11 months ago
- ☆19 Jan 29, 2026 Updated 2 weeks ago
- ☆29 Aug 30, 2024 Updated last year
- Chinese-Mistral: An Efficient and Effective Chinese Large Language Model ☆32 Jun 22, 2025 Updated 7 months ago
- 100 Production-Ready Claude Code Skills - The most comprehensive collection of AI skills for sales, business automation, content creation… ☆35 Oct 22, 2025 Updated 3 months ago
- A concise guide for new students of Lab 246 at BLCU (Beijing Language and Culture University) ☆10 May 30, 2022 Updated 3 years ago
- ☆36 Mar 18, 2025 Updated 10 months ago
- ☆28 Dec 4, 2025 Updated 2 months ago
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte… ☆25 Jan 6, 2026 Updated last month
- Write the database metadata into the dify knowledge ☆12 Dec 30, 2025 Updated last month
- Workflow automation, but you just describe what you want and it happens. ☆26 Nov 22, 2025 Updated 2 months ago
- ☆11 Aug 29, 2025 Updated 5 months ago
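- A web UI for MCP: the client can connect to multiple SSE servers, and LLMs such as openai, deepseek, and qwen are supported; also includes complete simple weather-query agent examples over stdio and SSE ☆41 May 23, 2025 Updated 8 months ago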
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc. ☆38 Sep 12, 2024 Updated last year
- ☆89 Jan 27, 2026 Updated 2 weeks ago
- A collection of practical code generation tasks and tests in open source projects. Complementary to HumanEval by OpenAI. ☆154 Dec 25, 2024 Updated last year
- ☆28 Jun 27, 2025 Updated 7 months ago