Use LLMs to batch-process data, generating or cleaning data for academic use. Supports OCR with a range of large models: Qwen (Tongyi Qianwen), Moonshot, Baidu PaddleOCR, OpenAI, and LLaVA.
☆16 · Sep 15, 2024 · Updated last year
Alternatives and similar repositories for LLM-Data-Cleaner
Users that are interested in LLM-Data-Cleaner are comparing it to the libraries listed below.
- MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning [NeurIPS 2025 Poster] ☆23 · Dec 10, 2025 · Updated 4 months ago
- ☆12 · Jul 19, 2023 · Updated 2 years ago
- ☆12 · May 19, 2024 · Updated last year
- Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features" ☆20 · Oct 2, 2022 · Updated 3 years ago
- ☆25 · Aug 29, 2025 · Updated 8 months ago
- Official Implementation for "Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation" ☆14 · May 6, 2025 · Updated 11 months ago
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths ☆18 · Jul 10, 2025 · Updated 9 months ago
- This is a FastAPI based LLM server. Load multiple LLM models (MLX or llama.cpp) simultaneously using multiprocessing. ☆17 · Apr 8, 2026 · Updated 3 weeks ago
- ☆18 · Dec 29, 2023 · Updated 2 years ago
- XGEN-MM(BLIP3) Autocaptioning Tools ☆17 · Jun 20, 2024 · Updated last year
- LLM Agents: Landing Page Generation for an E-commerce Platform Using CrewAI, Groq-LangChain and Qdrant ☆15 · May 30, 2024 · Updated last year
- Sparse Attention with Linear Units ☆20 · Apr 21, 2021 · Updated 5 years ago
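- Chinese fuzzy entity recognition in 100 lines, using a prefix tree (trie) and edit distance ☆11 · Sep 25, 2023 · Updated 2 years ago
- Knowledge-graph entity and relation extraction, knowledge-graph construction, and retrieval-augmented generation demo for building-code text data ☆38 · Aug 7, 2024 · Updated last year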
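- 2nd Shandong Province Data Application Innovation and Entrepreneurship Competition, main track, laboratory test report recognition: baseline ☆13 · Jan 15, 2021 · Updated 5 years ago
- ✨ Natural Language Database Query System (RAG) based on LLM ✨ (with README in English) 🚩 Ask questions in natural language, using large language models… ☆65 · May 27, 2025 · Updated 11 months ago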
- Precision Knowledge Editing (PKE): A novel method to reduce toxicity in LLMs while preserving performance, with robust evaluations and ha… ☆11 · Nov 26, 2024 · Updated last year
- An implementation of MSSRM method ☆10 · Mar 23, 2023 · Updated 3 years ago
- Exploring advanced prompting tools to query SQL database with multiple tables in natural language using LLMs ☆16 · Aug 23, 2024 · Updated last year
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere… ☆22 · May 9, 2025 · Updated 11 months ago
- The OCR component of ragflow; an unofficial project ☆54 · Aug 26, 2024 · Updated last year
- Learning to segment multi-organ and tumors from multiple partially labeled datasets ☆19 · Apr 8, 2021 · Updated 5 years ago
- PyTorch implementation of Single image super-resolution based on directional variance attention network (Pattern Recognition 2022) ☆21 · Apr 15, 2024 · Updated 2 years ago
- Yolov8-visualizations ☆11 · Mar 10, 2023 · Updated 3 years ago
- official implementation of Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation ☆13 · Apr 15, 2024 · Updated 2 years ago
- Docker + PaddleOCR + FastAPI ☆27 · Feb 1, 2023 · Updated 3 years ago
- ☆19 · Mar 12, 2025 · Updated last year
- ☆13 · Jan 22, 2025 · Updated last year
- ☆16 · Jan 16, 2025 · Updated last year
- ☆14 · May 1, 2023 · Updated 3 years ago
- Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation ☆411 · Updated this week
- Entry for the 1st Hunan Province Artificial Intelligence Competition (2020) ☆11 · Feb 17, 2022 · Updated 4 years ago
- [ECCV 2022] "TALISMAN: Targeted Active Learning for Object Detection with Rare Classes and Slices using Submodular Mutual Information" by… ☆10 · Sep 21, 2022 · Updated 3 years ago
- ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 Pytorch Code) ☆17 · May 15, 2025 · Updated 11 months ago
- Zen-NAS, a lightning fast, training-free Neural Architecture Searching algorithm ☆11 · Nov 12, 2021 · Updated 4 years ago
- AI Infra LLM infer/ tensorrt-llm/ vllm ☆24 · Updated this week
- Code release for "Category-Specific Prompts for Animal Action Recognition with Pretrained Vision-Language Models" ☆14 · Feb 21, 2024 · Updated 2 years ago
- ☆25 · Sep 3, 2025 · Updated 7 months ago
- YOLO object detection algorithm ☆15 · Jul 27, 2025 · Updated 9 months ago