WALLE-AI / uReasoningLLMs
An accessible introduction to reproducing DeepSeek-R1, with a collection of resources
☆18Updated 3 weeks ago
Alternatives and similar repositories for uReasoningLLMs:
Users who are interested in uReasoningLLMs are comparing it to the repositories listed below
- ☆30Updated 3 weeks ago
- Fast instruction tuning with Llama2☆11Updated 11 months ago
- LLM+RAG for QA☆21Updated last year
- Music large model based on InternLM2-chat.☆22Updated 3 months ago
- ThinkLLM: implementations of large language model algorithms and components☆27Updated last week
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆56Updated 11 months ago
- First-place solution to the Tianchi algorithm competition "BetterMixture - Large Model Data Mixing Challenge"☆28Updated 8 months ago
- GLM Series Edge Models☆131Updated last month
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …☆18Updated 11 months ago
- ☆11Updated 2 weeks ago
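- A survey of large language model training and serving☆37Updated last year
- Trains a LLaVA model with better Chinese support and open-sources the training code and data.☆54Updated 6 months ago
- Qwen1.5-SFT (Alibaba): fine-tuning (transformers), LoRA (peft), and inference for Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat☆55Updated 10 months ago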
- Code for Robust Fine-tuning (RbFT)☆11Updated 2 months ago
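- This project covers dataset synthesis, model training, and evaluation for the mathematical problem-solving ability of large models, along with notes on related articles.☆80Updated 6 months ago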
- 1.4B sLLM for Chinese and English - HammerLLM🔨☆44Updated 11 months ago
- LLM RAG application with support for API calls and voice interaction.☆11Updated 9 months ago
- A fluent, scalable, and easy-to-use LLM data processing framework.☆16Updated last week
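- Uses OCR recognition results to constrain the answers of multimodal large models in visual information extraction tasks☆27Updated 3 months ago
- Aims to train a mini Chinese large language model from scratch that can hold basic conversations; the model size depends on the hardware at hand☆59Updated 7 months ago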
- ☆18Updated 8 months ago
- Manages vllm-nccl dependency☆17Updated 9 months ago
- Pretrain, decay, and SFT a CodeLLM from scratch 🧙♂️☆35Updated 10 months ago
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang☆44Updated 4 months ago
- The newest version of Llama 3, with source code explained line by line in Chinese☆22Updated 11 months ago
- This repo offers advanced tutorials for LLMs, BERT-based models, and multimodal models, covering fine-tuning, quantization, vocabulary ex…☆16Updated 2 weeks ago
- RLHF experiments on a single A100 40G GPU. Supports PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, and reproducing DeepSeek R1-Zero.☆50Updated last month
- Fine-tuning Qwen models☆94Updated 3 weeks ago
- Code for "An Empirical Study of Retrieval Augmented Generation with Chain-of-Thought"☆12Updated 8 months ago
- ☆21Updated 7 months ago