YongWookHa / im2latex
Image-to-LaTeX PyTorch model
☆13 · Updated last year
Alternatives and similar repositories for im2latex:
Users who are interested in im2latex are comparing it to the libraries listed below.
- This repo is used to release the ArxivFormula dataset. ☆25 · Updated 2 months ago
- Official repository of the paper: "A Comprehensive Gold Standard and Benchmark for Comics Text Detection and Recognition" ☆25 · Updated last year
- A GPT-based generative LM for combined text and math formulas, leveraging tree-based formula encoding. ☆33 · Updated last year
- An unofficial implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents ☆36 · Updated last year
- ☆38 · Updated last year
- Tools for content data mining and NLP at scale ☆42 · Updated 7 months ago
- A library for simplifying fine-tuning with multi-GPU setups in the Hugging Face ecosystem. ☆16 · Updated 2 months ago
- Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta ☆16 · Updated 2 months ago
- ☆9 · Updated last year
- Python and JS tools to generate printed LaTeX formulas and images ☆15 · Updated last year
- DocBankLoader is a dataset loader for DocBank that can also convert DocBank to the format used by object detection models. ☆23 · Updated 3 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Updated 2 years ago
- Trying to deconstruct RWKV in understandable terms ☆14 · Updated last year
- ☆12 · Updated 8 months ago
- Datasets and Evaluation Scripts for CompHRDoc ☆31 · Updated 9 months ago
- An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR ☆15 · Updated 3 years ago
- Exploration into the Firefly algorithm in PyTorch ☆33 · Updated 4 months ago
- WikiTableSet: The largest publicly available image-based table recognition dataset in three languages, built from Wikipedia ☆27 · Updated last year
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions ☆42 · Updated 9 months ago
- DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction ☆17 · Updated last year
- A multimodal large-scale model, which performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p… ☆14 · Updated 11 months ago
- Download full or partial git-lfs repos without temporarily using 2x disk space ☆30 · Updated last year
- Chinese Mathematical Formula Detection (MFD) Dataset ☆32 · Updated 2 years ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; supports both image and video… ☆30 · Updated 6 months ago
- A dashboard for exploring timm learning rate schedulers ☆19 · Updated last month
- Implementation of the DocLLM paper for Llama models. ☆12 · Updated last month
- VimTS: A Unified Video and Image Text Spotter ☆75 · Updated 2 months ago
- Generator of document visual question answering datasets in English and Chinese ☆24 · Updated last year