用大模型批量处理数据,现支持各种大模型做OCR,支持通义千问, 月之暗面, 百度飞桨OCR, OpenAI 和LLAVA。Use LLM to generate or clean data for academic use. Support OCR with qwen, moonshot, PaddleOCR, OpenAI, Llava.
☆17Sep 15, 2024Updated last year
Alternatives and similar repositories for LLM-Data-Cleaner
Users that are interested in LLM-Data-Cleaner are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Web app to remove images background☆13Aug 6, 2024Updated last year
- ☆12May 19, 2024Updated 2 years ago
- ☆25Aug 29, 2025Updated 8 months ago
- Official Implementation for "Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation"☆14May 6, 2025Updated last year
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths☆18Jul 10, 2025Updated 10 months ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- 条件随机场(CRF)的pytorch实现☆10Mar 7, 2021Updated 5 years ago
- This is a FastAPI based LLM server. Load multiple LLM models (MLX or llama.cpp) simultaneously using multiprocessing.☆17Apr 8, 2026Updated last month
- ☆18Dec 29, 2023Updated 2 years ago
- XGEN-MM(BLIP3) Autocaptioning Tools☆17Jun 20, 2024Updated last year
- Sparse Attention with Linear Units☆20Apr 21, 2021Updated 5 years ago
- A PyTorch implementation of Vector Quantized Variational Autoencoder (VQ-VAE) with EMA updates, pretrained encoder, and K-means initializ…☆21Mar 26, 2026Updated last month
- 100行解决中文模糊实体识别with字典树和编辑距离 Chinese fuzzy entity matching with prefix tree and distance editing☆11Sep 25, 2023Updated 2 years ago
- 针对建筑规范文本数据的知识图谱实体关系提取,知识图谱构建,检索增强生成DEMO☆39Aug 7, 2024Updated last year
- ✨ 大语言模型 (LLM) 的自然语言数据库查询系统 (RAG) Natural Language Database Query System (RAG) based on LLM✨ (with README in English) 🚩 通过自然语言提问,使用大语言模型智…☆65May 27, 2025Updated 11 months ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- 研究生课程笔记。包含组合数学、高级算法设计与分析、最优化理论与应用、大数据分析与挖掘。☆15Dec 17, 2023Updated 2 years ago
- Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage)☆10Feb 2, 2024Updated 2 years ago
- Precision Knowledge Editing (PKE): A novel method to reduce toxicity in LLMs while preserving performance, with robust evaluations and ha…☆11Nov 26, 2024Updated last year
- Local DeepSearch (Advantage: Low Threshold): an implementation of Agentic RAG based on DeepSeek-R1 API and Tavily API☆17Jun 21, 2025Updated 11 months ago
- [CVPR 2025] Official Implementation of "MixerMDM: Learnable Composition of Human Motion Diffusion Models".☆25Sep 8, 2025Updated 8 months ago
- Exploring advanced prompting tools to query SQL database with multiple tables in natural language using LLMs☆16Aug 23, 2024Updated last year
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…☆22May 9, 2025Updated last year
- ragflow中的ocr部分,非官方项目☆54Aug 26, 2024Updated last year
- Learning to segment multi-organ and tumorsfrom multiple partially labeled datasets☆19Apr 8, 2021Updated 5 years ago
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- 「城语」APP基于A级景区、历史古迹、文物保护单位等基础数据,利用先进的大模型能力实现智能化的Citywalk 路线规划,包括设计一条路线、生成路线攻略、生成景点的推荐理由等三大核心功能;利用大模型减少了人工编辑和推荐的工作量,并可以根据游客的需求进行个性化定制,提升了游客…☆19Feb 20, 2024Updated 2 years ago
- ☆20Mar 12, 2025Updated last year
- ASR on WS, POST/GET FAST_API Can use many RU asr models.☆19May 12, 2026Updated last week
- ☆13Jan 22, 2025Updated last year
- 电子科技大学高级计算机视觉课程的作业代码☆13Sep 5, 2020Updated 5 years ago
- ☆16Jan 16, 2025Updated last year
- ☆14May 1, 2023Updated 3 years ago
- [ACL 2025] iAgent: LLM Agent as a Shield between User and Recommender Systems☆31May 23, 2025Updated 11 months ago
- [COLM 2024] SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models☆24Oct 5, 2024Updated last year
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- 2020湖南省第一届人工智能大赛参赛作品☆11Feb 17, 2022Updated 4 years ago
- ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 Pytorch Code)☆17May 15, 2025Updated last year
- Zen-NAS, a lightning fast, training-free Neural Architecture Searching algorithm☆11Nov 12, 2021Updated 4 years ago
- ☆12Aug 21, 2024Updated last year
- Code release for "Category-Specific Prompts for Animal Action Recognition with Pretrained Vision-Language Models"☆14Feb 21, 2024Updated 2 years ago
- ☆25Sep 3, 2025Updated 8 months ago
- ☆13May 26, 2025Updated 11 months ago