lperezmo / embeddings-extraction
Scripts for reading, extracting, and organizing data from HTML or PDF documents and preparing it for conversion into embeddings for use in context-augmented LLM queries.
☆12 Updated 5 months ago
Alternatives and similar repositories for embeddings-extraction:
Users who are interested in embeddings-extraction are comparing it to the libraries listed below.
- Chat Complex PDF with Tables Using IBM WatsonX, Langchain and LlamaParser. ☆11 Updated 9 months ago
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform ☆37 Updated last year
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆13 Updated this week
- OpenAI compatible API for open source LLMs ☆15 Updated last year
- AI_Powered_Dev_Search_Engine ☆12 Updated 11 months ago
- Query, ask and chat with a document-index via transformer models! ☆17 Updated last year
- The Swarm Ecosystem ☆19 Updated 6 months ago
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera… ☆30 Updated 10 months ago
- YouTube Transcript Cleaner is a simple web-based application that improves the readability of YouTube transcripts. ☆25 Updated last year
- Probably one of the lightest native RAG + Agent apps out there. Experience the power of Agent-powered models and Agent-driven knowledge ba… ☆21 Updated this week
- Luann allows you to create an LLM agent, which has a complete memory module (long-term memory, short-term memory) and knowledge module (Variou… ☆18 Updated this week
- Time-based thinking and structure like OpenAI's o1 preview. ☆10 Updated 5 months ago
- Discover advanced AI techniques in my repository combining Multi-Hop Chain of Thought (CoT) and Retrieval-Augmented Generation (RAG) usin… ☆13 Updated 6 months ago
- DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning mod… ☆19 Updated 2 years ago
- Rust bindings for CTranslate2 ☆14 Updated last year
- Datamallet is a Python library that contains several helper functions and modules for the common tasks in a typical data science workflow… ☆11 Updated 2 years ago
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆17 Updated this week
- A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images ☆31 Updated last year
- Example LangGraph flow that does "competitor analysis" on the web. ☆23 Updated 8 months ago
- A streaming Markdown component for Streamlit with LaTeX, Mermaid, table, and code support. A drop-in replacement for st.markdown. ☆14 Updated 4 months ago
- ProfitPilot closes deals for you effortlessly 24/7: just provide a list of customers and ProfitPilot will reach out on your behalf and clo… ☆22 Updated last year
- A framework for high-fidelity retrieval augmented generation in industrial knowledge bases. Integrates jargon identification, context rec… ☆28 Updated 6 months ago
- Unstract's interface to LLMs, Embeddings and VectorDBs. ☆18 Updated 6 months ago