CambioML / uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering
LLM-based text extraction from unstructured data such as PDFs, Word documents, and HTML pages. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!
☆208 Updated 10 months ago
Alternatives and similar repositories for uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering:
Users who are interested in uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering are comparing it to the libraries listed below.
- Accurate, private and configurable document retrieval LLM ☆120 Updated 3 weeks ago
- AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning (NeurIPS 2024) ☆188 Updated 3 weeks ago
- MAKGED is the first multi-agent framework for collaborative error detection in knowledge graphs. ☆27 Updated last month
- DeepRetrieval - Hacking 🔥Real Search Engines and Text/Data Retrievers with LLM + RL ☆196 Updated this week
- The Official Repo of ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code (https://a… ☆292 Updated 4 months ago
- Awesome-GraphRAG: A curated list of resources (surveys, papers, benchmarks, and opensource projects) on graph-based retrieval-augmented g… ☆864 Updated this week
- PyTorch Library for Relational Table Learning with LLMs. ☆422 Updated this week
- STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases (NeurIPS D&B 2024) ☆305 Updated 2 months ago
- "GraphAgent: Agentic Graph Language Assistant" ☆292 Updated last month
- "AnyGraph: Graph Foundation Model in the Wild" ☆207 Updated 6 months ago
- Babel - Open Multilingual Large Language Models Serving Over 90% of Global Speakers ☆196 Updated 3 weeks ago
- A curated list of awesome leaderboard-oriented resources for foundation models ☆259 Updated 3 weeks ago
- Your Automatic Prompt Engineering Assistant for GenAI Applications ☆2,091 Updated 11 months ago
- The official implementation of Self-Play Preference Optimization (SPPO) ☆508 Updated 2 months ago
- Common AI agent framework solving your data problems ☆125 Updated last month
- An AI agent powered by LLMs that streamlines the entire process of data analysis. 🚀 ☆390 Updated 7 months ago
- Create your self-hosted, open-source Operator model. ☆91 Updated last week
- A Contamination-free Multi-task Language Understanding Benchmark ☆114 Updated 2 months ago
- The official implementation of the ICML 2024 paper "MemoryLLM: Towards Self-Updatable Large Language Models" and "M+: Extending MemoryLLM… ☆131 Updated last month
- [ICLR Workshop 2025] An official source code for paper "GuardReasoner: Towards Reasoning-based LLM Safeguards". ☆129 Updated 3 weeks ago
- In-depth study of the graphrag ☆571 Updated last week
- Code for paper "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators" ☆195 Updated 8 months ago
- An ultra-lightweight agentic AI framework based on the ReAct paradigm. ☆190 Updated this week
- ☆31 Updated 4 months ago
- pykoi: Active learning in one unified interface ☆410 Updated last year
- TxBKG - Knowledge Graph Generation for Any PDFs ☆181 Updated 4 months ago
- BIRD-CRITIC 1.0: Can Large Language Models Solve USER SQL Issues in Real-World Database Applications? ☆395 Updated last month
- The repository for the paper titled "Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks" ☆154 Updated 3 months ago
- Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS ☆1,077 Updated last week
- A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks. ☆197 Updated 9 months ago