A repo for updating and debugging Mixtral-8x7B, MoE, ChatGLM3, LLaMA2, Baichuan, Qwen, and other LLM models, including new models such as Mixtral 8x7B, covering training, evaluation, and application.
☆47Oct 8, 2025Updated 5 months ago
Alternatives and similar repositories for quickllm
Users that are interested in quickllm are comparing it to the libraries listed below
- Run PyTorch models on Android GPUs with the Vulkan backend☆10Aug 15, 2023Updated 2 years ago
- LLM-related material, including code and docs☆13Feb 25, 2025Updated last year
- Pre-training Llama3 using Chinese☆13May 1, 2024Updated last year
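- Integrates advanced large language models such as Qwen and DeepSeek, supporting a pure LLM + classification layer mode and an LLM + LoRA + classification layer mode; uses the modular design and training of transformers so that components can be adjusted or replaced as needed.☆19Sep 1, 2025Updated 6 months ago
- LinChance Fine-tuning System: a model fine-tuning Web UI built with Streamlit and LLaMA-Factory☆14Feb 4, 2024Updated 2 years ago
- A Chinese version of Llama3, obtained from Llama3 through further CPT, SFT, and ORPO☆17Apr 24, 2024Updated last year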
- ☆11Updated this week
- Tutorial on LoRA fine-tuning DeepSeek with a multi-turn dialogue dataset☆60Dec 26, 2024Updated last year
- Luann (fka TypeAgent) allows you to create many LLM-based agents (various types of agents, scalable)☆24Feb 9, 2026Updated last month
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…☆22May 9, 2025Updated 10 months ago
- An introduction to using Docker and Docker Compose.☆21Sep 4, 2024Updated last year
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of …☆23Oct 1, 2024Updated last year
- Deepdive: Deep iterative thinking slash command for Claude Code - enables multi-round exploratory reasoning and non-linear problem-solvin…☆46Nov 9, 2025Updated 4 months ago
- A Generative Dialogue State Tracking Model☆22Jun 24, 2021Updated 4 years ago
- Embed your LLM into a python function☆22Jan 9, 2025Updated last year
- ☆24May 21, 2025Updated 9 months ago
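- Hello everyone! I am a feature-rich MCP service that aims to break down the barriers between devices and services and bring users a convenient experience. The weather tool links with meteorological platforms to quickly push real-time global weather to users, helping everyone plan their trips. The browser-control tool simulates manual operation, automatically searching and browsing web pages to save a great deal of time. The camera tool calls the local camera to take photos and record video, enabling face recognition and safeguarding home…☆14Apr 9, 2025Updated 11 months ago
- Backup of ntwork whl and wework binary files; note that they can only be used on Windows☆34Apr 30, 2024Updated last year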
- A simple WeChat Official Account layout tool based on Dify☆17Jun 27, 2025Updated 8 months ago
- ☆26Feb 28, 2026Updated last week
- A Complete Introduction to Building Generative AI Apps with Dify☆17May 25, 2025Updated 9 months ago
- SPPNet: An Approach For Real-Time Encrypted Traffic Classification Using Deep Learning☆10Aug 6, 2024Updated last year
- A repository for a Deep Q-Learning approach to intrusion detection for network cyber-attacks.☆10Sep 3, 2021Updated 4 years ago
- Android remote control tool, 天线 (Tianxian) 6.0 unlimited-use edition☆11Sep 9, 2023Updated 2 years ago
- A one-page WebUI integrating VITS inference, training, and output in Sherpa-Onnx format.☆12Feb 2, 2025Updated last year
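- Large language model applications: RAG, NL2SQL, chatbots, pre-training, MoE (mixture-of-experts) models, fine-tuning, reinforcement learning, Tianchi data competitions☆75Feb 10, 2025Updated last year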
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆32May 29, 2024Updated last year
- AI-powered cryptocurrency trading bot built using deep reinforcement learning (DRL). The bot is designed as a research platform for devel…☆10Jan 18, 2025Updated last year
- Write database metadata into the Dify knowledge base☆12Dec 30, 2025Updated 2 months ago
- ☆11Aug 29, 2025Updated 6 months ago
- kun-chat is a lightweight AI conversation app based on Ollama☆10Jul 16, 2025Updated 7 months ago
- ☆28Dec 4, 2025Updated 3 months ago
- Workflow automation, but you just describe what you want and it happens.☆27Nov 22, 2025Updated 3 months ago
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte…☆28Feb 13, 2026Updated 3 weeks ago
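- Zhijiang e-commerce review opinion mining competition; based on pytorch-transformers. So far only BERT-based joint labeling of aspect + opinion + attribute classification + sentiment polarity is implemented; CRF has not been added yet.☆31Aug 30, 2019Updated 6 years ago
- share data, prompt data, pretraining data☆36Nov 30, 2023Updated 2 years ago
- A reproduction of open-r1: GRPO training on 0.5B, 1.5B, 3B, and 7B Qwen models, with some interesting phenomena observed.☆56Apr 13, 2025Updated 10 months ago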
- Replication files for arXiv:1805.03735 Sequence Aggregation Rules for Anomaly Detection in Computer Network Traffic☆11Jan 6, 2019Updated 7 years ago
- A distilled DeepSeek-R1 variant built on Qwen2.5-32B, fine-tuned with curated data for enhanced performance and efficiency.☆16Mar 11, 2025Updated 11 months ago