owenliang / DeepSeek-Distill-Qwen-For-Child
☆30Updated 3 weeks ago
Alternatives and similar repositories for DeepSeek-Distill-Qwen-For-Child:
Users who are interested in DeepSeek-Distill-Qwen-For-Child are comparing it to the libraries listed below
- DPO training for Tongyi Qianwen (Qwen)☆40Updated 6 months ago
- Qwen1.5-SFT (Alibaba): fine-tuning (transformers), LoRA (peft), and inference for Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat☆55Updated 10 months ago
- The newest version of llama3, with source code explained line by line in Chinese☆22Updated 11 months ago
- ☆26Updated 5 months ago
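- LLM RAG application with support for API calls and voice interaction.☆11Updated 9 months ago
- ThinkLLM: implementations of large language model algorithms and components☆26Updated last week
- The llava-Qwen2-7B-Instruct-Chinese-CLIP model, with enhanced recognition of Chinese text and of the meaning behind memes, approaching the recognition level of gpt4o and claude-3.5-sonnet!☆22Updated 8 months ago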
- A unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues. ✨🤖📚💬☆57Updated 2 weeks ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆30Updated 10 months ago
- Train a mini Chinese large language model from scratch that can hold basic conversations; the model size depends on the hardware at hand☆59Updated 7 months ago
- LLM Tokenizer with BPE algorithm☆31Updated 10 months ago
- General-purpose information extraction based on the Qwen2 model (entity/relation/event extraction)☆30Updated 8 months ago
- GLM Series Edge Models☆131Updated last month
- Simple decoder-only GPT model in PyTorch☆37Updated 10 months ago
- LLM+RAG for QA☆21Updated last year
- Repo for the paper "AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction".☆64Updated 8 months ago
- In this fast-paced world, we all need a little something to spice up life. Whether you need a glass of sweet talk to lift your spirits or…☆50Updated 2 months ago
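- The simplest reproduction of R1 results on a small model, explaining the most important essence of O1-like models and DeepSeek R1. Think is all you need. Backed by experiments: for strong reasoning ability, the content of the think process is the core of AGI/ASI.☆40Updated last month
- Large language model applications: RAG, NL2SQL, chatbots, pretraining, MoE (mixture of experts) models, fine-tuning, reinforcement learning, and Tianchi data competitions☆58Updated last month
- Train a LLaVA model with better Chinese support, with open-sourced training code and data.☆54Updated 6 months ago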
- ☆106Updated 9 months ago
- pretrain a wiki llm using transformers☆32Updated 6 months ago
- As the name suggests: a hand-built RAG☆121Updated last year
- ☆15Updated 9 months ago
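- A survey of large language model training and serving☆37Updated last year
- A hands-on tutorial on fine-tuning Qwen1.8chat with LoRA☆26Updated 5 months ago
- This project mainly presents prompt engineering use cases, including building a simulated intelligent recommendation customer-service system, plus advanced demos for question answering, chain of thought, self-consistency, and tree of thoughts, aimed at helping readers understand prompts. A single codebase supports multiple large models (such as OpenAI and Alibaba's Tongyi Qianwen) and wraps the application as an API using FastAPI.☆28Updated 6 months ago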
- A minimalist benchmarking tool designed to test the routine-generation capabilities of LLMs.☆21Updated 4 months ago
- from MHA, MQA, GQA to MLA by 苏剑林, with code☆13Updated last month
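- This project aims to give beginners in the large-model field a comprehensive body of knowledge, covering both basic and advanced content, so that developers can quickly master the large-model technology stack and gain a thorough understanding of related topics.☆51Updated 2 months ago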