Use LLMs to batch-process data: generate or clean data for academic use. Now supports OCR with a range of large models, including Qwen (通义千问), Moonshot (月之暗面), Baidu PaddleOCR, OpenAI, and LLaVA.
☆17 · Sep 15, 2024 · Updated last year
Alternatives and similar repositories for LLM-Data-Cleaner
Users that are interested in LLM-Data-Cleaner are comparing it to the libraries listed below.
- MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning [NeurIPS 2025 Poster] ☆24 · Dec 10, 2025 · Updated 6 months ago
- Learn how to create impactful AI Agents using the Agno AI Python package ☆13 · Jul 31, 2025 · Updated 10 months ago
- ☆12 · Jul 19, 2023 · Updated 2 years ago
- Based on an e-commerce shopping-guide chatbot: natural language understanding (NLU), text error correction, and disambiguation of ambiguous words ☆12 · May 5, 2020 · Updated 6 years ago
- An attempt at implementing the deep learning model proposed in the paper "Teaching Robots to Draw" ☆11 · Aug 13, 2021 · Updated 4 years ago
- Introduction to large language models ☆20 · Mar 16, 2024 · Updated 2 years ago
- Accelerating GOT-OCRv2 with vLLM ☆10 · Nov 15, 2024 · Updated last year
- Demo app with Loguru logging, async middleware to generate X-request-Id. Works with Gunicorn or Uvicorn, and is safe to use with async/th… ☆10 · Feb 2, 2022 · Updated 4 years ago
- ☆12 · May 19, 2024 · Updated 2 years ago
- Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features" ☆20 · Oct 2, 2022 · Updated 3 years ago
- Official Implementation for "Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation" ☆15 · May 6, 2025 · Updated last year
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths ☆19 · Jul 10, 2025 · Updated 11 months ago
- PyTorch implementation of conditional random fields (CRF) ☆10 · Mar 7, 2021 · Updated 5 years ago
- This is a FastAPI based LLM server. Load multiple LLM models (MLX or llama.cpp) simultaneously using multiprocessing. ☆17 · Apr 8, 2026 · Updated 2 months ago
- XGEN-MM (BLIP3) Autocaptioning Tools ☆17 · Jun 20, 2024 · Updated last year
- LLM Agents: Landing Page Generation for an E-commerce Platform Using CrewAI, Groq-LangChain and Qdrant ☆15 · May 30, 2024 · Updated 2 years ago
- A PyTorch implementation of Vector Quantized Variational Autoencoder (VQ-VAE) with EMA updates, pretrained encoder, and K-means initializ… ☆22 · Mar 26, 2026 · Updated 2 months ago
- Chinese fuzzy entity recognition in 100 lines, using a prefix tree (trie) and edit distance ☆11 · Sep 25, 2023 · Updated 2 years ago
- Self-supervised method for completing partial LiDAR point clouds. Trained and tested on ShapeNet and SemanticKITTI in TensorFlow. (BMVC 2… ☆14 · Oct 15, 2022 · Updated 3 years ago
- ☆10 · Apr 30, 2025 · Updated last year
- Knowledge-graph entity and relation extraction, knowledge-graph construction, and a retrieval-augmented generation (RAG) demo for building-code text data ☆39 · Aug 7, 2024 · Updated last year
- Baseline for the 2nd Shandong Province Data Application Innovation and Entrepreneurship Competition (main track): test report recognition ☆13 · Jan 15, 2021 · Updated 5 years ago
- ✨ Natural Language Database Query System (RAG) based on LLM ✨ (with README in English) 🚩 Ask questions in natural language, using a large language model to… ☆65 · May 27, 2025 · Updated last year
- Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage) ☆10 · Feb 2, 2024 · Updated 2 years ago
- Precision Knowledge Editing (PKE): A novel method to reduce toxicity in LLMs while preserving performance, with robust evaluations and ha… ☆11 · Nov 26, 2024 · Updated last year
- An implementation of MSSRM method ☆10 · Mar 23, 2023 · Updated 3 years ago
- PyTorch implementation of MoLA ☆22 · Jun 9, 2025 · Updated last year
- LoRA SFT on medical-domain data, based on the DeepSeek and Qwen3 large models ☆15 · Apr 10, 2026 · Updated 2 months ago
- Unsupervised domain adaptation for cross-modality liver segmentation via joint adversarial learning and self-learning ☆16 · Feb 10, 2023 · Updated 3 years ago
- Local DeepSearch (Advantage: Low Threshold): an implementation of Agentic RAG based on DeepSeek-R1 API and Tavily API ☆17 · Jun 21, 2025 · Updated 11 months ago
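- DETR tensor: removes the unused auxiliary heads from inference, speeds deployment up further with FP16, and offers a new method for fixing all-zero outputs after conversion to TensorRT. ☆11 · Jan 9, 2024 · Updated 2 years ago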
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere… ☆22 · May 9, 2025 · Updated last year
- Yolov8-visualizations ☆11 · Mar 10, 2023 · Updated 3 years ago
- Code for Rethinking Prompt Optimizers: From Prompt Merits to Optimization ☆13 · Jan 12, 2026 · Updated 5 months ago
- Official implementation of Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation ☆13 · Apr 15, 2024 · Updated 2 years ago
- Docker + PaddleOCR + FastAPI ☆28 · Feb 1, 2023 · Updated 3 years ago
- ☆20 · Mar 12, 2025 · Updated last year
- ☆13 · Jan 22, 2025 · Updated last year
- Assignment code for the Advanced Computer Vision course at the University of Electronic Science and Technology of China (UESTC) ☆13 · Sep 5, 2020 · Updated 5 years ago