MurtuzaBohra/SimpDOM


Simplified DOM Trees for Transferable Attribute Extraction from the Web

☆43

Alternatives and similar repositories for SimpDOM

Users who are interested in SimpDOM are comparing it to the repositories listed below.

ilyalasy / DOM-LM
Unofficial PyTorch implementation of the DOM-LM paper.
☆35 · Mar 6, 2023 · Updated 3 years ago
MartinCastroAlvarez / html2vec
Algorithm that converts an HTML document into a vectorized object suitable for neural networks.
☆14 · Nov 2, 2020 · Updated 5 years ago
scrapinghub / product-extraction-benchmark
☆16 · Apr 10, 2026 · Updated 3 months ago
X-LANCE / WebSRC-Baseline
[EMNLP 2021] The baseline code for WebSRC dataset.
☆51 · Apr 2, 2025 · Updated last year
redreamality / webke
Knowledge extraction from semi-structured web.
☆13 · Mar 25, 2024 · Updated 2 years ago
cdlockard / expanded_swde
☆14 · Apr 18, 2020 · Updated 6 years ago
xrr233 / Webformer
SIGIR-2022 Webformer: Pre-training with Web Pages for Information Retrieval
☆50 · Sep 20, 2022 · Updated 3 years ago
MovePhilip / Webformer
Unofficial implementation of Webformer: The Web-page Transformer for Structure Information Extraction
☆13 · Apr 20, 2023 · Updated 3 years ago
stanford-oval / schema2qa
Schema2QA Question Answering Dataset
☆19 · Aug 22, 2022 · Updated 3 years ago
Yiwen-Yang-666 / GAN-BERT-CRF
An approach that takes advantage of deep learning features to use unannotated samples for NER and to identify sequences with erroneous labels.
☆15 · Feb 4, 2024 · Updated 2 years ago
jjonescz / awe
AI-based web extractor
☆12 · Feb 25, 2023 · Updated 3 years ago
lumiqai / UOI-1806.01264
Unofficial implementation of the paper "OpenTag: Open Attribute Value Extraction from Product Profiles"
☆33 · Aug 22, 2018 · Updated 7 years ago
sdpmas / Scotch
In-IDE Code Search
☆29 · Apr 29, 2022 · Updated 4 years ago
ir-ischool-uos / mwpd
☆13 · Sep 28, 2020 · Updated 5 years ago
kutschkem / QGen
Question generation from text
☆15 · Sep 19, 2012 · Updated 13 years ago
webis-de / ecir21-an-empirical-comparison-of-web-page-segmentation-algorithms
☆25 · Jul 25, 2024 · Updated 2 years ago
Petroniuss / groupcache
groupcache is a distributed caching and cache-filling library ported from Go to Rust.
☆17 · Feb 2, 2026 · Updated 5 months ago
vinid / prodb
☆18 · Sep 16, 2022 · Updated 3 years ago
zzaebok / AppUsage2Vec
AppUsage2Vec - PyTorch Implementation
☆12 · Apr 22, 2021 · Updated 5 years ago
natureLanguageQing / new_energy_relation_center
Enterprise event extraction
☆13 · May 20, 2021 · Updated 5 years ago
jsrl / aemet-elt
ELT for AEMET weather data.
☆16 · Mar 23, 2025 · Updated last year
X-LANCE / TIE
[NAACL 2022] TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages
☆22 · Jun 3, 2022 · Updated 4 years ago
SkyRiver-2000 / TRAD-Official
[SIGIR 2024] TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision
☆20 · Mar 28, 2024 · Updated 2 years ago
winter1203 / vllm_GOT2_OCR
Accelerating GOT-OCRv2 with VLLM
☆10 · Nov 15, 2024 · Updated last year
imc-trading / telerista
☆18 · Apr 4, 2020 · Updated 6 years ago
allenai / ask4help
Code for the Ask4Help project
☆22 · Nov 24, 2022 · Updated 3 years ago
xnancy / russ
☆16 · Apr 9, 2021 · Updated 5 years ago
shinleylee / Arbitrary_Distribution_Modeling
Arbitrary Distribution Modeling with Censorship in Real Time Bidding Advertising for KDD'22
☆16 · Mar 9, 2022 · Updated 4 years ago
facebookresearch / comet_memory_dialog
Code for Navigating Connected Memories with a Task-oriented Dialog System
☆18 · Dec 12, 2022 · Updated 3 years ago
natureLanguageQing / datafountain_news
Baseline sharing: internet news sentiment analysis
☆11 · Oct 12, 2019 · Updated 6 years ago
chanind / linear-relational
Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch
☆11 · Aug 7, 2024 · Updated last year
Hazem-Ben-Khalfallah / test-cherry
An IntelliJ plugin that generates unit test methods with meaningful names based on described behaviours with @should tags in methods ja…
☆10 · Dec 14, 2025 · Updated 7 months ago
HKUST-KnowComp / ComHyper
[EMNLP2020] When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models
☆11 · Nov 10, 2020 · Updated 5 years ago
CarperAI / CodeReviewSE
Stuff related to scraping the Code Review StackExchange
☆11 · Jan 19, 2023 · Updated 3 years ago
Wokstym / react-vis-graph-wrapper
A React component to display beautiful network graphs using vis.js
☆15 · Mar 6, 2022 · Updated 4 years ago
Wang-Shuo / SIGIR2020Challenge
The 1st place solution for SIGIR 2020 E-Commerce Workshop Multimodal Product Classification Challenge
☆21 · Aug 3, 2020 · Updated 5 years ago
kdavila / ChartInfo_annotation_tools
Release for CHART annotation tools used for ICDAR CHART 2019 competition
☆29 · Sep 15, 2023 · Updated 2 years ago
circularfashion / cf-circularity-id-standard
The circularity.ID Open Data Standard. The standard represents the results and findings of an extensive six-year research into the needs …
☆23 · Nov 30, 2023 · Updated 2 years ago
IBPA / LOVE
Learning Ontologies Via Embeddings
☆12 · Jul 6, 2023 · Updated 3 years ago