kevalmorabia97 / CoVA-Web-Object-Detection
A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot!
☆88 Updated last year
Related projects:
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item … ☆41 Updated 3 years ago
- Simplified DOM Trees for Transferable Attribute Extraction from the Web ☆36 Updated last year
- A fork of Dragnet that also extracts author, headline, date, and keywords from context, as well as built-in metadata extraction, all in one pac… ☆216 Updated 8 months ago
- Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination? ☆119 Updated 7 months ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆166 Updated last year
- ☆29 Updated 5 months ago
- Unofficial PyTorch implementation of the Dom-LM paper. ☆30 Updated last year
- ☆49 Updated last month
- ☆26 Updated last month
- Semantic search with embeddings: index anything ☆136 Updated 2 years ago
- The dataset includes screen summaries that describe the functionalities of Android app screenshots. It is used for training and evaluation of … ☆44 Updated 3 years ago
- ☆316 Updated 8 months ago
- We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datas… ☆73 Updated last year
- Object Detection Model for Scanned Documents ☆77 Updated 11 months ago
- My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model" ☆68 Updated last week
- Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks. ☆113 Updated last year
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆101 Updated last week
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆109 Updated 8 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64 Updated 2 months ago
- Implementation of Microsoft's VIPS algorithm in Python ☆19 Updated 4 years ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆180 Updated last month
- Completion After Prompt Probability. Make your LLM make a choice ☆68 Updated last week
- The dataset includes widget captions that describe the functionalities of UI elements. It is used for training and evaluation of the widget ca… ☆16 Updated 3 years ago
- This repository contains all my experiments with the LayoutLMv3 model. ☆27 Updated 2 years ago
- ☆37 Updated 3 years ago
- SIGIR-2022 Webformer: Pre-training with Web Pages for Information Retrieval ☆47 Updated 2 years ago
- This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning" ☆195 Updated 4 months ago
- Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task… ☆253 Updated last year
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis ☆235 Updated last year
- ☆237 Updated last year