kevalmorabia97 / CoVA-Web-Object-Detection
A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot!
☆92 · Updated 3 weeks ago
Alternatives and similar repositories for CoVA-Web-Object-Detection
Users who are interested in CoVA-Web-Object-Detection are comparing it to the repositories listed below:
- Simplified DOM Trees for Transferable Attribute Extraction from the Web ☆38 · Updated 5 months ago
- [NAACL 2022] TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages ☆19 · Updated 2 years ago
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item … ☆41 · Updated 3 years ago
- Unofficial PyTorch implementation of the DOM-LM paper. ☆33 · Updated 2 years ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆76 · Updated 5 months ago
- SIGIR-2022 Webformer: Pre-training with Web Pages for Information Retrieval ☆47 · Updated 2 years ago
- Index of URLs to pdf files all over the internet and scripts ☆22 · Updated last year
- Streamlit Named Entity Recognition (NER) annotation custom component ☆38 · Updated 2 years ago
- Completion After Prompt Probability. Make your LLM make a choice ☆74 · Updated 4 months ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆175 · Updated 2 years ago
- ReadingBank: A Benchmark Dataset for Reading Order Detection ☆104 · Updated 6 months ago
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆123 · Updated last year
- RaKUn 2.0 - A fast keyword detection algorithm ☆66 · Updated last month
- We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datas… ☆79 · Updated 2 years ago
- A brand tagging system in product titles and user generated text ☆35 · Updated 3 years ago
- The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format, and desc… ☆63 · Updated last year
- Build Semantic Search with S-BERT and Fine-tune your model in unsupervised way ☆58 · Updated 2 years ago
- Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task… ☆269 · Updated 2 years ago
- Google DeepMind's PromptBreeder for automated prompt engineering, implemented in LangChain Expression Language. ☆98 · Updated 7 months ago
- Experiments with generating opensource language model assistants ☆97 · Updated last year
- Simply, faster, sentence-transformers ☆141 · Updated 6 months ago
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆87 · Updated last year
- Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination? ☆127 · Updated last year
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ☆242 · Updated last year
- multimodal document analysis ☆164 · Updated 9 months ago