Benchmarking Vision-Language Models on OCR tasks in Dynamic Video Environments
☆47Feb 14, 2025Updated last year
Alternatives and similar repositories for ocr-benchmark
Users that are interested in ocr-benchmark are comparing it to the libraries listed below.
- Give your agents real time desktop perception. Stream screen, microphone, and system audio for live context and actions.☆26Apr 23, 2026Updated last week
- Server-side video workflows for agents: ingest, understand, search, edit, stream.☆74Apr 7, 2026Updated 3 weeks ago
- ☆21Sep 15, 2025Updated 7 months ago
- An open-source agent toolkit that auto-syncs SDK versions, docs, and examples—built for seamless integration with LLMs, and AI agents ( M…☆49Mar 26, 2026Updated last month
- Frontend interface for building chat based system and connecting with agent driven workflows.☆16Sep 19, 2025Updated 7 months ago
- Official PyTorch Implementation of "Better Source, Better Flow: Learning Condition-Dependent Source Distribution for Flow Matching"☆31Mar 1, 2026Updated 2 months ago
- A collection of noise designs for diffusion models [Eurographics Tutorial / SIGGRAPH Course, 2025]☆28May 26, 2025Updated 11 months ago
- ☆19Mar 25, 2025Updated last year
- ☆16Apr 7, 2024Updated 2 years ago
- Python package providing functionality and plotting for chemistry method comparison☆16Feb 28, 2024Updated 2 years ago
- VisionGRU: A Linear-Complexity RNN Model for Efficient Image Analysis☆13Dec 26, 2024Updated last year
- ☆11Aug 23, 2024Updated last year
- Instantly create video clips from LLM prompts☆174Aug 22, 2024Updated last year
- Compile simple Shadertoys into small .COM MS-DOS executables☆10May 13, 2024Updated last year
- ☆20Jun 21, 2024Updated last year
- Official Implementation of "Video Camera Trajectory Editing with Generative Rendering from Estimated Geometry"☆32Nov 10, 2025Updated 5 months ago
- A repo based on XiLin Li's PSGD repo that extends some of the experiments.☆14Oct 7, 2024Updated last year
- re-implementation of instantsplat (unofficial)☆16Aug 5, 2024Updated last year
- ☆33Nov 10, 2025Updated 5 months ago
- [ICASSP2024] An official implementation of the paper "EFFICIENT SCENE TEXT IMAGE SUPER-RESOLUTION WITH SEMANTIC GUIDANCE"☆25May 12, 2024Updated last year
- DSPy prompt optimization demo from AI Tinkerers presentation☆18Aug 15, 2025Updated 8 months ago
- ☆10Oct 25, 2020Updated 5 years ago
- Code released for paper titled "MonoTher-Depth: Enhancing Thermal Depth Estimation via Confidence-Aware Distillation"☆17Sep 22, 2025Updated 7 months ago
- Official Implementation of "Multi-Granularity Video Object Segmentation" (AAAI 2025)☆24Dec 20, 2024Updated last year
- [ACM MM 2025] LIDAR: Lightweight Adaptive Cue-Aware Fusion Vision Mamba for Multimodal Segmentation of Structural Cracks☆22Nov 18, 2025Updated 5 months ago
- Atari 2600 music demo for @ party 2024. Also contains the TIunA software pitch engine☆12Jun 25, 2024Updated last year
- A Python-based framework for creating and testing trading strategies☆46Dec 6, 2025Updated 4 months ago
- Bring your code and prompts easily to your LLM☆21Jun 10, 2025Updated 10 months ago
- Implementing Meta AI's VGGT as a FiftyOne Remote Zoo Model☆20Jun 20, 2025Updated 10 months ago
- Refactor your code with local LLM in VSCode☆13Mar 14, 2024Updated 2 years ago
- DeepFake ECG generator.☆16Apr 7, 2026Updated 3 weeks ago
- Official implementation of "InterRVOS: Interaction-aware Referring Video Object Segmentation".☆27Dec 31, 2025Updated 4 months ago
- Build tools for LLMs in Rust using Model Context Protocol☆37Feb 25, 2025Updated last year
- A project to take an audio file and separate it into speakers and play it with avatars and save the recording as an mp4 for sharing on so…☆13Nov 6, 2024Updated last year
- This code is published for skyline detection☆29Mar 19, 2026Updated last month
- CLV prediction with pareto-NBD model☆12Jul 1, 2016Updated 9 years ago
- Official Implementation of "Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation"☆49Jan 29, 2026Updated 3 months ago
- An AI agents framework addressing the two core challenges with real-world agents - Optimisation and Deployment☆14Apr 3, 2024Updated 2 years ago
- Flutter + WebAssembly Example☆13Mar 3, 2020Updated 6 years ago