oztrkoguz / VisQueryPDF
Automatically describes the images in PDF files and generates questions from those descriptions. Its RAG pipeline then runs these questions against the PDF text content to extract and analyze information.
☆11Updated 4 months ago
Related projects
Alternatives and complementary repositories for VisQueryPDF
- This project aims to compare different Retrieval-Augmented Generation (RAG) frameworks in terms of speed and performance.☆12Updated 3 months ago
- Image identification with the Kosmos2 model; draws and crops bounding boxes using object detection☆18Updated 3 months ago
- This project offers a user-friendly interface that allows users to easily create stories and enrich them with visuals. It supports creati…☆24Updated 6 months ago
- This project is an automated research and summarization tool that allows users to conduct research on a specific question and summarize t…☆13Updated 5 months ago
- ☆21Updated 5 months ago
- Few-Shot Prompting - Chain-of-Thought (CoT) Prompting - Hallucinations - Self-Consistency - Generated Knowledge Prompting - Tree of …☆25Updated last year
- ☆11Updated 7 months ago
- ☆52Updated 7 months ago
- ☆10Updated 9 months ago
- Dream Interpreter inside ComfyUI☆76Updated 3 months ago
- Custom nodes for using fal API. Video generation with Kling, Runway, Luma. Image generation with Flux. LLMs and VLMs OpenAI, Claude, Llam…☆75Updated 2 weeks ago
- ☆19Updated last week
- a node for AuraSR☆22Updated 4 months ago
- A bunch of image manipulation nodes☆50Updated this week
- Slice regions of the canvas and convert them to masks for regional conditions, with PNG preview output. Also includes a few support nodes.☆46Updated 2 weeks ago
- AI-api text generation☆25Updated last month
- A ComfyUI node that automatically generates image labels for LoRA or DreamBooth training on Flux series models☆66Updated 2 months ago
- ComfyUI custom node for FunAudioLLM, including CosyVoice and SenseVoice☆49Updated this week
- ComfyUI custom node for filtering tags based on categories such as pose, gesture, action, emotion, expression, camera, angle, sensitive, …☆27Updated 2 weeks ago
- Try OmniParser in ComfyUI, a simple screen parsing tool towards a pure vision-based GUI agent.☆28Updated 2 weeks ago
- ☆20Updated 5 months ago
- Unofficial implementation of PhotoMakerV2 for ComfyUI☆13Updated 3 months ago
- Phi-3.5-vision-instruct fast talk with image☆17Updated 3 months ago
- Florence-2 image captioning and tasks☆70Updated 4 months ago
- ComfyUI node for F5-Text To Speech☆29Updated last week
- ☆45Updated 8 months ago
- The successful integration of Qwen2-VL-Instruct into the ComfyUI platform has enabled a smooth operation, supporting (but not limited to)…☆66Updated last month
- Fixed Attention Couple, NegPip(negative weights in prompts) for SDXL and FLUX, more CFG++ and SMEA DY samplers, etc.☆80Updated this week
- ☆22Updated last month
- A prompt helper☆59Updated 3 months ago