Use LLMs to batch-process, generate, or clean data for academic use. Supports OCR with a range of large models: Qwen (Tongyi Qianwen), Moonshot, Baidu PaddleOCR, OpenAI, and LLaVA.
☆16Sep 15, 2024Updated last year
Alternatives and similar repositories for LLM-Data-Cleaner
Users who are interested in LLM-Data-Cleaner are comparing it to the libraries listed below.
- ☆12Jul 19, 2023Updated 2 years ago
- Introduction to large language models☆20Mar 16, 2024Updated 2 years ago
- Accelerating GOT-OCRv2 with VLLM☆10Nov 15, 2024Updated last year
- ☆12May 19, 2024Updated last year
- ☆25Aug 29, 2025Updated 7 months ago
- Tingshu 听舒 | Bringing the author’s voice directly to you☆33Dec 3, 2024Updated last year
- A PyTorch implementation of conditional random fields (CRF)☆10Mar 7, 2021Updated 5 years ago
- A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" that provides an OpenAI-compatibl…☆14Feb 7, 2025Updated last year
- This is a FastAPI based LLM server. Load multiple LLM models (MLX or llama.cpp) simultaneously using multiprocessing.☆17Updated this week
- Sparse Attention with Linear Units☆20Apr 21, 2021Updated 4 years ago
- ☆10Apr 30, 2025Updated 11 months ago
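- Knowledge-graph entity and relation extraction, knowledge-graph construction, and a retrieval-augmented generation demo for building-code text data☆37Aug 7, 2024Updated last year
- 2nd Shandong Province Data Application Innovation and Entrepreneurship Competition, main track: lab test report recognition, Baseline☆13Jan 15, 2021Updated 5 years ago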
- Welcome to the Medical LLM repository, your premier gateway to a structured and meticulously designed pipeline exclusively crafted for la…☆18Apr 10, 2024Updated 2 years ago
- LoRA SFT on medical-domain data, based on the DeepSeek and Qwen3 large models☆15Mar 25, 2026Updated 2 weeks ago
- Local DeepSearch (Advantage: Low Threshold): an implementation of Agentic RAG based on DeepSeek-R1 API and Tavily API☆17Jun 21, 2025Updated 9 months ago
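- DETR tensor: removes unused auxiliary heads during inference, speeds up further with FP16 deployment, and offers a new fix for all-zero outputs after TensorRT conversion.☆11Jan 9, 2024Updated 2 years ago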
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…☆22May 9, 2025Updated 11 months ago
- Yolov8-visualizations☆11Mar 10, 2023Updated 3 years ago
- Code for Rethinking Prompt Optimizers: From Prompt Merits to Optimization☆13Jan 12, 2026Updated 3 months ago
- ☆35Oct 22, 2025Updated 5 months ago
- ASR on WS, POST/GET FAST_API Can use many RU asr models.☆19Jan 27, 2026Updated 2 months ago
- ☆13Jan 22, 2025Updated last year
- ☆16Jan 16, 2025Updated last year
- ☆14May 1, 2023Updated 2 years ago
- ☆17Jun 10, 2025Updated 10 months ago
- [COLM 2024] SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models☆24Oct 5, 2024Updated last year
- [ECCV 2022] "TALISMAN: Targeted Active Learning for Object Detection with Rare Classes and Slices using Submodular Mutual Information" by…☆10Sep 21, 2022Updated 3 years ago
- Zen-NAS, a lightning fast, training-free Neural Architecture Searching algorithm☆11Nov 12, 2021Updated 4 years ago
- Code release for "Category-Specific Prompts for Animal Action Recognition with Pretrained Vision-Language Models"☆14Feb 21, 2024Updated 2 years ago
- ☆25Sep 3, 2025Updated 7 months ago
- YOLO object detection algorithm☆15Jul 27, 2025Updated 8 months ago
- ☆13May 26, 2025Updated 10 months ago
- This is the official code repository for "Deep learning model for coronary artery segmentation and quantitative stenosis detection in ang…☆29Jan 12, 2026Updated 3 months ago
- Pipeline-Parallel Lecture: Simplest Dualpipe Implementation.☆31Sep 17, 2025Updated 6 months ago
- [AAAI 2026] ReCode: Reinforced Code Knowledge Editing for API Updates☆24Jul 1, 2025Updated 9 months ago
- A tool for training sentence embeddings, supporting CoSENT, SimCSE, and InfoNCE☆23Jun 17, 2025Updated 9 months ago
- ☆10Feb 17, 2024Updated 2 years ago
- ☆18Apr 7, 2025Updated last year