Travvy88 / DocumentGenerator_DoGe
Synthetic Document Generator for Document AI. Creates document images annotated with the text and bounding box of each word. Images contain headings, tables, and paragraphs with varied formatting and fonts. Can be used for OCR, document transformer pretraining, text detection, and other tasks.
☆29Updated 6 months ago
Alternatives and similar repositories for DocumentGenerator_DoGe
Users who are interested in DocumentGenerator_DoGe are comparing it to the libraries listed below
- Tools and agents for automated research.☆48Updated 2 months ago
- Automatic Prompt Optimization Framework☆169Updated last week
- Aggregation framework for annotating datasets in computer vision tasks (detection, segmentation, video captioning etc.)☆11Updated last year
- Library for recognizing Russian Federation identity documents☆41Updated this week
- Scripts and stuff☆18Updated 3 years ago
- Telegram MCP Server and HTTP-MTProto bridge | Multi-user auth, intelligent search, file sending, web setup | Docker & PyPI ready☆20Updated last week
- Modified Arena-Hard-Auto LLM evaluation toolkit with an emphasis on Russian language☆46Updated 10 months ago
- Effective LLM Alignment Toolkit☆152Updated 7 months ago
- Handwritten Text Generation☆17Updated 3 years ago
- Language model project for morphemic analysis, segmentation, and tokenization of Russian words.☆16Updated last year
- LangChain-compatible integrations with YandexGPT and YandexGPT Embeddings☆43Updated 9 months ago
- GigaChain telegram bot example for technical support☆36Updated last year
- Benchmark comparing Russian analogues of ChatGPT: Saiga, YandexGPT, Gigachat☆61Updated 2 years ago
- ☆47Updated 3 years ago
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating SOTA mode…☆39Updated 2 weeks ago
- Advanced RAG examples☆40Updated last year
- Hector RAG is a modular RAG framework built on PostgreSQL, offering advanced retrieval methods and fusion techniques for AI-driven applic…☆60Updated 11 months ago
- CLIP implementation for Russian language☆148Updated 2 years ago
- OmniFusion — a multimodal model to communicate using text and images☆234Updated last year
- ☆31Updated last year
- A set of scripts and configurations for pretraining of Large Language Models (LLM)☆36Updated 11 months ago
- Top ML papers of the week.☆45Updated this week
- SAGE: Spelling correction, corruption and evaluation for multiple languages☆164Updated 2 months ago
- Telegram bot for different language models. Supports system prompts and images☆63Updated 7 months ago
- Training and data processing code for Saiga☆54Updated last month
- Thin wrapper around OpenAI Whisper API with streaming support☆86Updated 2 months ago
- LLM-based meme generator with templates☆13Updated 2 months ago
- XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning☆14Updated last year
- ☆22Updated 2 years ago
- Benchmark for evaluating the ability of language models to solve math and physics problems in Russian☆23Updated 2 months ago