raphael-baena / DTLR
Handwritten Text Recognition and Character Detection
☆95Updated this week
Related projects
Alternatives and complementary repositories for DTLR
- ☆116Updated 2 months ago
- [ECCV 2024] Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models☆56Updated 2 weeks ago
- ☆165Updated 4 months ago
- VimTS: A Unified Video and Image Text Spotter☆72Updated this week
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance"☆72Updated this week
- CursorCore: Assist Programming through Aligning Anything☆65Updated 3 weeks ago
- ☆67Updated this week
- Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi…☆41Updated 10 months ago
- Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model☆234Updated 3 months ago
- ☆188Updated 2 weeks ago
- A matting method that combines dynamic 2D foreground layers and a 3D background model.☆123Updated last year
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"☆127Updated 5 months ago
- ☆259Updated last week
- A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM.☆36Updated 2 months ago
- Analysis of Chinese and English layouts☆121Updated 3 weeks ago
- Search, organize, discover anything!☆47Updated 6 months ago
- MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training)☆127Updated 5 months ago
- This is the official implementation of "Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams"☆128Updated 3 months ago
- The First Multimodal Search Engine Pipeline and Benchmark for LMMs☆389Updated last month
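- Real-time voice-interactive digital human, supporting an end-to-end voice pipeline (GLM-4-Voice - THG) and a cascaded pipeline (ASR-LLM-TTS-THG). Customizable appearance and voice, no training required, voice cloning supported, first-packet latency as low as 3s. …☆268Updated this week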
- We are the first character large model that is fully available for commercial use.☆35Updated 3 months ago
- ☆145Updated 2 months ago
- official repository of StyleSketch☆64Updated 3 months ago
- To try textin document parsing, visit https://cc.co/16YSIy☆23Updated 4 months ago
- ☆27Updated 5 months ago
- Vary-tiny codebase upon LAVIS (for training from scratch) and a dataset of PDF image-text pairs (about 600k, including English and Chinese)☆68Updated last month
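- 📝 Extracts content from document images and reproduces each image one-to-one in Word or Txt for further use or processing. Planned: PDF/image input, with output in JSON, Txt, Word, and Markdown formats.☆149Updated last week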
- [ICCV2023] Segment Every Reference Object in Spatial and Temporal Spaces☆235Updated 10 months ago
- Cookbook for Crafting Good Code☆47Updated 7 months ago
- coze api to openai☆11Updated 2 months ago