Use LLMs to batch-process data: generate or clean data for academic use. Supports OCR with Qwen (Tongyi Qianwen), Moonshot, Baidu PaddleOCR, OpenAI, and LLaVA.
☆16Sep 15, 2024Updated last year
Alternatives and similar repositories for LLM-Data-Cleaner
Users that are interested in LLM-Data-Cleaner are comparing it to the libraries listed below
- ☆10Apr 30, 2025Updated 10 months ago
- CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter☆22May 28, 2025Updated 9 months ago
- Finetune and Inference Qwen3-0.6B.☆28May 5, 2025Updated 9 months ago
- Yet Another Papers With Code☆35Sep 7, 2025Updated 5 months ago
- Code for Robust Fine-tuning (RbFT)☆17Jan 31, 2025Updated last year
- A minimal LLM sales agent framework for fast deployment and benchmarking of sales agents. Supports OpenAI models, Claude, HuggingFace models, Gem…☆19Sep 6, 2024Updated last year
- A hybrid-inference extension plugin for vllm; supports multi-NUMA hybrid inference, reaching 1000+ prefill for single-GPU inference of the Qwen3-Next model☆31Nov 7, 2025Updated 3 months ago
- ☆11Updated this week
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…☆22May 9, 2025Updated 9 months ago
- ☆26May 11, 2025Updated 9 months ago
- A travel agent based on Qwen2.5, fine-tuned with SFT + DPO/PPO/GRPO on a travel question-answer dataset; a mindmap can be output using …☆56Nov 14, 2025Updated 3 months ago
- Codebase for Instruction Following without Instruction Tuning☆36Sep 24, 2024Updated last year
- ☆28Oct 14, 2024Updated last year
- ☆23Updated this week
- A Complete Introduction to Building Generative AI Apps with Dify☆17May 25, 2025Updated 9 months ago
- A simple WeChat Official Account layout tool based on Dify☆17Jun 27, 2025Updated 8 months ago
- Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage)☆10Feb 2, 2024Updated 2 years ago
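- Knowledge-graph entity and relation extraction, knowledge-graph construction, and a retrieval-augmented generation demo for building-code text data☆37Aug 7, 2024Updated last year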
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆32May 29, 2024Updated last year
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte…☆27Feb 13, 2026Updated 2 weeks ago
- ☆43Feb 9, 2026Updated 3 weeks ago
- MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning [NeurIPS 2025 Poster]☆23Dec 10, 2025Updated 2 months ago
- Write the database metadata into the dify knowledge☆12Dec 30, 2025Updated 2 months ago
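- A medical question-answering system based on Qwen2 + SFT + DPO. The project uses custom SFTTrainer/DPOTrainer/TRPOTrainer classes for training, and also calls various knowledge-base tools (neo4j, milvus, LDA, etc.) to generate training data automatically. In addition, vllm is used for inference…☆61Jan 4, 2026Updated last month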
- ☆28Dec 4, 2025Updated 2 months ago
- A Multi-Session and Multi-Therapy Benchmark for High-Realism AI Psychological Counselor☆29Jan 13, 2026Updated last month
- ☆11Aug 29, 2025Updated 6 months ago
- Workflow automation, but you just describe what you want and it happens.☆27Nov 22, 2025Updated 3 months ago
- Maximizing the Performance of a Simple RAG using RL☆90Mar 20, 2025Updated 11 months ago
- Entry for the first Hunan Province Artificial Intelligence Competition (2020)☆11Feb 17, 2022Updated 4 years ago
- A universal skills runtime framework SDK for building, deploying, and executing modular capabilities across diverse environments.☆27Updated this week
- ☆10Dec 29, 2023Updated 2 years ago
- Zhiyu AI (知予人工智能): From Learner to Researcher☆13Jan 20, 2025Updated last year
- Python Telegraph API.☆15Mar 22, 2025Updated 11 months ago
- ☆14May 1, 2023Updated 2 years ago
- 🤖AI Agents for Financial Trading💰: LLM-Driven Stock Prediction & Investment Recommendation System☆13Apr 14, 2025Updated 10 months ago
- Learn how to create impactful AI Agents using Agno AI Python Package☆13Jul 31, 2025Updated 7 months ago
- Use the knowledge graph generated by GraphRAG as the external knowledge base for the Dify workflow.☆21Jun 4, 2025Updated 8 months ago
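- LangReact is a configuration-driven development tool for Planning Agent applications; through configuration and plugins, it can quickly add planning capabilities to your GPT application.☆12Apr 23, 2024Updated last year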