iamarunbrahma / vision-parse
Parse PDFs into markdown using Vision LLMs
☆260 Updated last week
Alternatives and similar repositories for vision-parse:
Users who are interested in vision-parse are comparing it to the libraries listed below.
- Turn local files into a prompt for an LLM ☆163 Updated last month
- Voice-Enabled Math Tutor Powered by Groq that Calculates and Renders Live Problems and Instruction with LaTeX in Seconds! ☆215 Updated last month
- PostBot 3000 is an open-source project that shows how to build a powerful AI agent and stream responses and generate artifacts. This proj… ☆284 Updated 2 months ago
- This repository hosts a suite of specialized agents designed to power your brainstorming sessions. Each agent brings a unique perspective… ☆286 Updated 3 months ago
- Use OpenAI's realtime API for chatting with your documents ☆313 Updated 4 months ago
- Serverless Modal + FastAPI + React + ColPali + Qdrant + GPT4o Vision RAG (V-RAG) Demo ☆344 Updated 3 months ago
- Assistant for voice-to-blog writing ☆126 Updated 3 weeks ago
- An open-source implementation of NotebookLM using Deepseek-V3 and PlayHT TTS. ☆237 Updated last month
- Excalidraw meets ComfyUI for LLMs ☆228 Updated 2 weeks ago
- A Chrome extension for asking questions over websites ☆282 Updated 2 weeks ago
- Multi-agent that helps you organize and write documents. ☆320 Updated 3 months ago
- Dabbling with ReAct chatbots ☆174 Updated 6 months ago
- SearchGPT / Perplexity Pages clone, but personalised for you. ☆235 Updated 5 months ago
- OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking ☆388 Updated last week
- A simple Python program to implement the search-extract-summarize flow. ☆253 Updated 3 weeks ago
- Gemini Multimodal Live + WebRTC in a single `app.ts` ☆180 Updated last month
- Structured information extraction from documents ☆305 Updated 4 months ago
- ☆245 Updated this week
- Build realtime voice and video agents with Google's new Gemini 2.0 (API is free for now) ☆255 Updated last week
- Prompt optimization scratch ☆617 Updated last week
- On-premises conversational RAG with configurable containers ☆407 Updated this week
- Realtime API with Firecrawl Tool - Forked from the OpenAI Realtime Console ☆153 Updated 4 months ago
- ☆56 Updated 4 months ago
- Your first AI prompt engineer ☆361 Updated 3 months ago
- ☆207 Updated 4 months ago
- Summarize and query from a lot of heterogeneous documents. Any LLM provider, any filetype, scalable (?), WIP ☆331 Updated this week
- Your fully proficient, AI-powered and local chatbot assistant 🤖 ☆227 Updated 8 months ago
- Yet another open source Perplexity ☆420 Updated 3 months ago