klarna-incubator / webtraversallibrary
The Web Traversal Library (WTL) is a Python library for abstracting web interactions on top of a base execution layer such as Selenium.
☆72 Updated 6 months ago
Alternatives and similar repositories for webtraversallibrary
Users who are interested in webtraversallibrary are comparing it to the libraries listed below.
- A Python-based HTML to text conversion library, command line client and Web service. ☆325 Updated this week
- A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot! ☆93 Updated 8 months ago
- Extract text from HTML ☆134 Updated 5 years ago
- Python library to simplify working with jsonlines and ndjson data ☆304 Updated last year
- Article extraction benchmark: dataset and evaluation scripts ☆339 Updated 2 months ago
- Parse numbers written in natural language ☆123 Updated last year
- A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine ☆187 Updated 2 weeks ago
- ☆54 Updated last year
- A purely-functional HTML builder for Python. Think JSX rather than templates. ☆102 Updated 10 months ago
- Source code for the paper "Web2Text: Deep Structured Boilerplate Removal", full paper @ ECIR'18 ☆170 Updated 4 years ago
- A fork of Dragnet that also extracts author, headline, date, keywords from context, as well as built-in metadata extraction all in one pac… ☆297 Updated 6 months ago
- Python port of Boilerpipe library ☆95 Updated last year
- Extract price amount and currency symbol from a raw text string ☆342 Updated last month
- Simplified DOM Trees for Transferable Attribute Extraction from the Web ☆40 Updated last year
- Production-grade embedding generation, for any length of text, for transformer models. ☆23 Updated 5 months ago
- Lego AI Parser is an open-source application that uses OpenAI to parse visible text of HTML elements. ☆235 Updated last year
- MiniWoB++: a web interaction benchmark for reinforcement learning ☆351 Updated 6 months ago
- Pydantic extension for annotating autocorrecting fields. ☆222 Updated last year
- λprompt - A functional programming interface for building AI systems ☆380 Updated last year
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters ☆150 Updated 3 weeks ago
- estela, an elastic web scraping cluster 🕸 ☆191 Updated last week
- Pythonic search engine based on PyLucene. ☆131 Updated 3 weeks ago
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut… ☆157 Updated 3 years ago
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe… ☆68 Updated 3 years ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆142 Updated 2 weeks ago
- Web content extraction using machine learning ☆34 Updated 4 years ago
- Python JSON parser for reading JSON objects out of JS files ☆47 Updated 2 years ago
- News API - fetch news from CommonCrawl, parse with NewsPlease, enrich with pre-trained machine-learning models, to structured searchable … ☆29 Updated 3 years ago
- Parse natural language time expressions in Python ☆131 Updated 2 years ago
- Show the differences between two strings/text as a compact text, in markdown/HTML, in the terminal and more. ☆147 Updated 2 weeks ago