kevalmorabia97 / CoVA-Web-Object-Detection
A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot!
☆93 Updated 5 months ago
Alternatives and similar repositories for CoVA-Web-Object-Detection
Users who are interested in CoVA-Web-Object-Detection are comparing it to the libraries listed below.
- Simplified DOM Trees for Transferable Attribute Extraction from the Web ☆38 Updated 10 months ago
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item … ☆44 Updated 4 years ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆179 Updated 2 years ago
- multimodal document analysis ☆165 Updated last year
- [NAACL 2022] TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages ☆20 Updated 3 years ago
- A fork of Dragnet that also extracts author, headline, date, keywords from context, as well as built in metadata extraction all in one pac… ☆291 Updated 2 months ago
- Completion After Prompt Probability. Make your LLM make a choice ☆80 Updated 9 months ago
- ☆248 Updated 2 years ago
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆128 Updated last year
- ☆657 Updated 2 months ago
- Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks. ☆130 Updated 2 years ago
- A curated mobile app design database ☆61 Updated 3 years ago
- ☆120 Updated last year
- ☆32 Updated last year
- ReadingBank: A Benchmark Dataset for Reading Order Detection ☆107 Updated 11 months ago
- Index of URLs to pdf files all over the internet and scripts ☆24 Updated 2 years ago
- Detectron2 Webserver (Faster-RCNN) implementation for Ubuntu 20.04. Real time object detection served over the internet. ☆32 Updated 2 years ago
- H&M Fashion Image similarity search with Weaviate and DocArray ☆43 Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated 9 months ago
- We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datas… ☆80 Updated 2 years ago
- ScreenQA dataset was introduced in the "ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots" paper. It contains ~86K … ☆122 Updated 6 months ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆68 Updated last week
- Semantic search with embeddings: index anything ☆139 Updated 3 years ago
- Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination? ☆127 Updated last year
- The dataset includes widget captions that describe UI elements' functionalities. It is used for training and evaluation of the widget ca… ☆22 Updated 4 years ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆108 Updated last year
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆210 Updated 3 months ago
- The largest multilingual image-text classification dataset. It contains fashion products. ☆73 Updated 2 years ago
- The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format, and desc… ☆73 Updated last year
- A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs ☆116 Updated 2 years ago