Travvy88 / DocumentGenerator_DoGe
Synthetic Document Generator for Document AI. Creates document images annotated with the text and bounding box of each word. Images contain headings, tables, and paragraphs with varied formatting and fonts. Can be used for OCR, document transformer pretraining, text detection, and other tasks.
☆21Updated 2 months ago
Alternatives and similar repositories for DocumentGenerator_DoGe
Users who are interested in DocumentGenerator_DoGe are comparing it to the libraries listed below.
- Effective LLM Alignment Toolkit☆131Updated 3 weeks ago
- GigaChain Telegram bot example for technical support☆32Updated 5 months ago
- Framework for processing and filtering datasets☆27Updated 10 months ago
- T5-based (Russian) text normalization☆21Updated last year
- Aggregation framework for annotating datasets in computer vision tasks (detection, segmentation, video captioning etc.)☆11Updated 7 months ago
- Universal LLM Telegram chatbot in Python☆17Updated 9 months ago
- ☆22Updated last year
- ☆26Updated this week
- A language model project for morphemic analysis, segmentation, and tokenization of Russian words.☆16Updated 4 months ago
- ☆12Updated 5 months ago
- ☆46Updated last month
- AI information, kept as up to date as possible, plus research from ChatGPT☆19Updated this week
- Augmentex — a library for augmenting texts with errors☆64Updated 11 months ago
- SAGE: Spelling correction, corruption and evaluation for multiple languages☆153Updated 5 months ago
- A benchmark comparing Russian ChatGPT alternatives: Saiga, YandexGPT, Gigachat☆61Updated last year
- Examples of advanced RAG☆35Updated 8 months ago
- A new second practical assignment for Huawei's NLP course☆17Updated last year
- A simple text normalizer for use before speech synthesis☆31Updated last year
- Modified Arena-Hard-Auto LLM evaluation toolkit with an emphasis on Russian language☆42Updated 2 months ago
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating fundament…☆61Updated 8 months ago
- Russian text segmenter and tokenizer☆16Updated 4 years ago
- CLIP implementation for Russian language☆144Updated last year
- ☆31Updated 8 months ago
- A set of scripts and configurations for pretraining of Large Language Models (LLM)☆29Updated 3 months ago
- Russian Text Expansion based on ruGPT3Large☆25Updated 3 years ago
- Handwritten Text Generation☆16Updated 2 years ago
- First-of-its-kind AI benchmark for evaluating the protection capabilities of large language model (LLM) guard systems (guardrails and saf…☆38Updated this week
- Scripts and stuff☆18Updated 2 years ago
- ☆42Updated last week
- ☆8Updated 2 years ago