klarna / product-page-dataset
☆52 · Updated 8 months ago
Alternatives and similar repositories for product-page-dataset:
Users who are interested in product-page-dataset are comparing it to the repositories listed below:
- [NAACL 2022] TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages ☆19 · Updated 2 years ago
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item … ☆42 · Updated 3 years ago
- A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot! ☆92 · Updated 2 months ago
- [EMNLP 2021] The baseline code for WebSRC dataset. ☆50 · Updated 3 weeks ago
- Index of URLs to pdf files all over the internet and scripts ☆23 · Updated last year
- Learning UI Similarity using Graph Networks ☆37 · Updated 4 years ago
- Release for CHART annotation tools used for ICDAR CHART 2019 competition ☆27 · Updated last year
- ☆13 · Updated 4 years ago
- CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training ☆32 · Updated 2 years ago
- DialOp: Decision-oriented dialogue environments for collaborative language agents ☆106 · Updated 5 months ago
- ☆40 · Updated 2 months ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆45 · Updated last year
- Finding semantically meaningful and accurate prompts. ☆46 · Updated last year
- The code related to the baselines from NeurIPS 2021 paper "DUE: End-to-End Document Understanding Benchmark." ☆36 · Updated 2 years ago
- This repo contains data and code for the paper "Reasoning over Public and Private Data in Retrieval-Based Systems." ☆46 · Updated 9 months ago
- A Corpus of 475,000 Industrial Occupations ☆67 · Updated 4 years ago
- This repository contains the open-source version of the datasets that were used for different parts of training and testing of models that grou… ☆32 · Updated 4 years ago
- ☆39 · Updated 3 years ago
- ☆29 · Updated last year
- The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. … ☆139 · Updated 2 years ago
- ☆30 · Updated last year
- Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21) ☆44 · Updated 3 years ago
- Tools for content datamining and NLP at scale ☆43 · Updated 10 months ago
- ☆29 · Updated last year
- This repository contains the code to reproduce the experiments of the poster "Supervised Contrastive Learning for Product Matching" ☆38 · Updated 3 years ago
- [WSDM 2024] Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding ☆14 · Updated last year
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆175 · Updated 2 years ago
- Simplified DOM Trees for Transferable Attribute Extraction from the Web ☆38 · Updated 6 months ago
- Submission archive for the MS MARCO document ranking leaderboard ☆29 · Updated last year
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ☆29 · Updated 2 years ago