CambioML / uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering
LLM-based text extraction from unstructured data such as PDFs, Word documents, and HTML pages. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!
☆196 · Updated 8 months ago
Alternatives and similar repositories for uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering:
Users who are interested in uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering are comparing it to the repositories listed below.
- Accurate, private and configurable document retrieval LLM ☆120 · Updated last month
- GraphRAG-survey: A curated list of resources on graph-based retrieval-augmented generation for customized large language models. ☆558 · Updated this week
- The Official Repo of ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code (https://a… ☆287 · Updated 3 months ago
- AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning (NeurIPS 2024) ☆178 · Updated last month
- A curated list of awesome leaderboard-oriented resources for foundation models ☆253 · Updated last month
- PyTorch Library for Relational Table Learning with LLMs. ☆316 · Updated this week
- An AI agent powered by LLMs that streamlines the entire process of data analysis. 🚀 ☆365 · Updated 6 months ago
- The repository for the paper titled "Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks" ☆152 · Updated last month
- The official implementation of Self-Play Preference Optimization (SPPO) ☆481 · Updated 3 weeks ago
- Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS ☆618 · Updated this week
- MPLSandbox is an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler a… ☆172 · Updated 3 months ago
- [ACL2024 Findings] Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM ☆56 · Updated last month
- STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases (NeurIPS D&B 2024) ☆298 · Updated last month
- [ICLR 2025] Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows ☆332 · Updated last week
- TxBKG - Knowledge Graph Generation for Any PDFs ☆180 · Updated 2 months ago
- "GraphAgent: Agentic Graph Language Assistant" ☆258 · Updated last week
- An ultra-lightweight agentic AI framework based on the ReAct paradigm. ☆184 · Updated 2 weeks ago
- "AnyGraph: Graph Foundation Model in the Wild" ☆205 · Updated 5 months ago
- Your Automatic Prompt Engineering Assistant for GenAI Applications ☆2,081 · Updated 9 months ago
- Build multimodal language agents for fast prototype and production ☆1,777 · Updated this week
- A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks. ☆196 · Updated 7 months ago
- Common AI agent framework solving your data problems ☆122 · Updated last month
- DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models ☆127 · Updated last month
- The official code for "Olympus: A Universal Task Router for Computer Vision Tasks" ☆46 · Updated 2 months ago
- An open-source collection of legal prompts ☆353 · Updated last year
- Code for paper "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators" ☆193 · Updated 6 months ago
- One-stop data intelligence agent, providing insights from all mainstream data formats in a single dialogue box, including documents, data… ☆382 · Updated 3 months ago