video-db / ocr-benchmark
Benchmarking Vision-Language Models on OCR tasks in Dynamic Video Environments
☆45 Updated 8 months ago
Alternatives and similar repositories for ocr-benchmark
Users who are interested in ocr-benchmark are comparing it to the repositories listed below.
- ☆105 Updated this week
- Gradio UI for a Cog API ☆69 Updated last year
- Useful resources for LLM-based Diarization and Transcription. ☆55 Updated last year
- Build AI Agents with Your Existing Python Code! ☆67 Updated last year
- 🐮📢 The first AI voice assistant that interrupts *you* ☆148 Updated last year
- ☆116 Updated 10 months ago
- An automated tool for discovering insights from research paper corpora ☆138 Updated last year
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆62 Updated 11 months ago
- Simple program to manually caption your images (or any other file types) so you can use them for AI training ☆37 Updated 2 years ago
- Turn text from websites into spoken audio with edge-tts, F5, etc. and save as mp3 files ☆46 Updated 4 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆102 Updated 10 months ago
- ☆54 Updated last month
- Build Web Datasets with Ease ☆33 Updated last year
- ☆73 Updated 9 months ago
- VLM driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆125 Updated 4 months ago
- Using the moondream VLM with optical flow for promptable object tracking ☆72 Updated 8 months ago
- The next evolution of Agents ☆47 Updated this week
- Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦 ☆61 Updated last year
- GRDN.AI app for garden optimization ☆70 Updated last year
- auto fine tune of models with synthetic data ☆75 Updated last year
- ☆22 Updated 5 months ago
- Gradio based tool to run open-source LLM models directly from Huggingface ☆96 Updated last year
- Interactive timeline of AI history ☆62 Updated last month
- A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integrat… ☆64 Updated last year
- ☆17 Updated 10 months ago
- tiny_fnc_engine is a minimal python library that provides a flexible engine for calling functions extracted from an LLM. ☆38 Updated last year
- Cerule - A Tiny Mighty Vision Model ☆67 Updated last year
- ☆47 Updated last year
- Benchmark that evaluates LLMs using 759 NYT Connections puzzles extended with extra trick words ☆155 Updated 2 weeks ago
- How to use bounding boxes with the Gemini API ☆106 Updated last year