Travvy88 / DocumentGenerator_DoGe
Synthetic Document Generator for Document AI. Creates document images annotated with the text and bounding box of each word. Images contain headings, tables, and paragraphs with varied formatting and fonts. Can be used for OCR, document transformer pretraining, text detection, and other tasks.
☆29Updated 4 months ago
Alternatives and similar repositories for DocumentGenerator_DoGe
Users interested in DocumentGenerator_DoGe are comparing it to the libraries listed below.
- Tools and agents for automated research.☆47Updated 2 weeks ago
- Training and data processing code for Saiga☆53Updated 5 months ago
- GigaChain telegram bot example for technical support☆37Updated 11 months ago
- ☆12Updated 2 years ago
- A language model project for morphemic analysis, segmentation, and tokenization of Russian words.☆16Updated 11 months ago
- Scripts and stuff☆18Updated 2 years ago
- Modified Arena-Hard-Auto LLM evaluation toolkit with an emphasis on Russian language☆45Updated 9 months ago
- Effective LLM Alignment Toolkit☆150Updated 5 months ago
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating SOTA mode…☆38Updated 2 months ago
- Automatic Prompt Optimization Framework☆61Updated this week
- Library for recognizing identity documents of the Russian Federation☆35Updated 7 months ago
- OmniFusion — a multimodal model to communicate using text and images☆234Updated last year
- LangChain-compatible integrations with YandexGPT and YandexGPT Embeddings☆44Updated 7 months ago
- Automated machine learning for text classification☆46Updated last month
- LLM-based meme generator with templates☆13Updated 2 weeks ago
- StealthMessage is a platform that lets users send and receive anonymous messages without revealing their identity.☆17Updated last year
- SirChatalot is a Telegram bot leveraging ChatGPT, Claude or YandexGPT. It uses Whisper for speech-to-text and DALL-E, Stability AI or Yan…☆72Updated 9 months ago
- Aggregation framework for annotating datasets in computer vision tasks (detection, segmentation, video captioning etc.)☆11Updated last year
- GigaAgent is a universal orchestrator agent for solving a wide range of tasks (ReAct + REPL)☆134Updated this week
- AI-powered text compression tool that condenses content while preserving meaning across multiple formats.☆22Updated last year
- Top ML papers of the week.☆42Updated this week
- OpenAPI-like API-server for voice generation (TTS) based on fish-speech-1.5 model.☆28Updated 6 months ago
- Hector RAG is a modular RAG framework built on PostgreSQL, offering advanced retrieval methods and fusion techniques for AI-driven applic…☆59Updated 9 months ago
- Boost your efficiency with Fish Speech Batch Inference. Easily process multiple texts and achieve consistently great results. 🗨️🐟☆24Updated 4 months ago
- Telegram bot for different language models. Supports system prompts and images☆63Updated 5 months ago
- Examples of advanced RAG☆40Updated last year
- Embedding Studio is a framework that lets you transform your vector database into a feature-rich search engine.☆383Updated 7 months ago
- XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning☆14Updated last year
- Benchmark comparing Russian analogues of ChatGPT: Saiga, YandexGPT, Gigachat☆61Updated 2 years ago
- Right click text for AI chat☆49Updated 3 months ago