Use LLMs to batch-process data: generate or clean data for academic use. Now supports OCR with a range of models, including Qwen (Tongyi Qianwen), Moonshot, Baidu PaddleOCR, OpenAI, and LLaVA.
☆17Sep 15, 2024Updated last year
Alternatives and similar repositories for LLM-Data-Cleaner
Users that are interested in LLM-Data-Cleaner are comparing it to the libraries listed below.
- Building a quick conversation-based search demo with langchain.☆10Apr 2, 2024Updated 2 years ago
- MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning [NeurIPS 2025 Poster]☆24Dec 10, 2025Updated 6 months ago
- ☆12Jul 19, 2023Updated 2 years ago
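- Built around an e-commerce shopping-guide bot: natural language understanding (NLU), text error correction, and word-sense disambiguation☆12May 5, 2020Updated 6 years ago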
- Accelerating GOT-OCRv2 with VLLM☆10Nov 15, 2024Updated last year
- Demo app with Loguru logging, async middleware to generate X-request-Id. Works with Gunicorn or Uvicorn, and is safe to use with async/th…☆10Feb 2, 2022Updated 4 years ago
- ☆25Aug 29, 2025Updated 10 months ago
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths☆19Jul 10, 2025Updated 11 months ago
- A minimal LLM sales agent framework for sales agent fast deployment and benchmark. Support OpenAI models, Claude, HuggingFace models, Gem…☆20Sep 6, 2024Updated last year
- A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" that provides an OpenAI-compatibl…☆14Feb 7, 2025Updated last year
- ☆18Dec 29, 2023Updated 2 years ago
- LLM Agents: Landing Page Generation for an E-commerce Platform Using CrewAI, Groq-LangChain and Qdrant☆15May 30, 2024Updated 2 years ago
- Sparse Attention with Linear Units☆20Apr 21, 2021Updated 5 years ago
- Chinese fuzzy entity matching in 100 lines, using a prefix tree (trie) and edit distance☆11Sep 25, 2023Updated 2 years ago
- ☆10Apr 30, 2025Updated last year
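- Demo of knowledge-graph entity and relation extraction, knowledge-graph construction, and retrieval-augmented generation for building-code text data☆39Aug 7, 2024Updated last year
- ✨ Natural Language Database Query System (RAG) based on LLM ✨ (with README in English) 🚩 Ask questions in natural language, using a large language model to…☆65May 27, 2025Updated last year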
- ☆20Apr 24, 2024Updated 2 years ago
- Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage)☆10Feb 2, 2024Updated 2 years ago
- An implementation of MSSRM method☆10Mar 23, 2023Updated 3 years ago
- Welcome to the Medical LLM repository, your premier gateway to a structured and meticulously designed pipeline exclusively crafted for la…☆19Apr 10, 2024Updated 2 years ago
- LoRA SFT on medical-domain data, based on the DeepSeek and Qwen3 large models☆15Apr 10, 2026Updated 2 months ago
- Unsupervised domain adaptation for cross-modality liver segmentation via joint adversarial learning and self-learning☆16Feb 10, 2023Updated 3 years ago
- Local DeepSearch (Advantage: Low Threshold): an implementation of Agentic RAG based on DeepSeek-R1 API and Tavily API☆17Jun 21, 2025Updated last year
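- DETR tensor: removes the unused auxiliary heads from inference, speeds up deployment further with FP16, and offers a new fix for all-zero outputs after conversion to TensorRT.☆11Jan 9, 2024Updated 2 years ago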
- Exploring advanced prompting tools to query SQL database with multiple tables in natural language using LLMs☆16Aug 23, 2024Updated last year
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…☆22May 9, 2025Updated last year
- official implementation of Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation☆13Apr 15, 2024Updated 2 years ago
- Docker + PaddleOCR + FastAPI☆28Feb 1, 2023Updated 3 years ago
- ☆20Mar 12, 2025Updated last year
- Finetune and Inference Qwen3-0.6B.☆29May 5, 2025Updated last year
- ASR on WS, POST/GET FAST_API Can use many RU asr models.☆19Jun 11, 2026Updated 3 weeks ago
- ☆13Jan 22, 2025Updated last year
- This repository is part of an NLP course for humanities and cultural studies. This course uses historical newspapers as a source and appl…☆20Jun 5, 2025Updated last year
- ☆16Jan 16, 2025Updated last year
- ☆14May 1, 2023Updated 3 years ago
- An AIGC application integrating LLM and SDXL☆29Jan 3, 2024Updated 2 years ago
- [ACL 2025] iAgent: LLM Agent as a Shield between User and Recommender Systems☆32May 23, 2025Updated last year
- Entry for the first Hunan Province Artificial Intelligence Competition (2020)☆11Feb 17, 2022Updated 4 years ago