scrapinghub / number-parser
Parse numbers written in natural language
☆114 · Updated 6 months ago
Alternatives and similar repositories for number-parser
Users who are interested in number-parser are comparing it to the libraries listed below:
- Common interface for data container classes ☆67 · Updated last month
- Library to populate items using XPath and CSS with a convenient API ☆48 · Updated last month
- Web scraping Page Objects core library ☆99 · Updated 2 months ago
- Extract price amount and currency symbol from a raw text string ☆327 · Updated 2 months ago
- A pure-Python robots.txt parser with support for modern conventions. ☆65 · Updated last month
- Extract text from HTML ☆135 · Updated 4 years ago
- Automatic unit test generation for Scrapy. ☆56 · Updated 3 years ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆126 · Updated 4 months ago
- Page Object pattern for Scrapy ☆121 · Updated 2 months ago
- A Python implementation of Lunr.js 🌖 ☆195 · Updated last month
- universal character encoding detector ☆58 · Updated 7 months ago
- Parsing JavaScript objects into Python data structures ☆203 · Updated last month
- A python based HTML to text conversion library, command line client and Web service. ☆303 · Updated last month
- Python port of Boilerpipe library ☆86 · Updated 8 months ago
- Schema.org classes in pydantic ☆67 · Updated 2 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆151 · Updated 3 months ago
- Analyze scraped data ☆46 · Updated 5 years ago
- URL normalization for Python ☆94 · Updated last week
- Scrapy middleware to add extra fields to items, like timestamp, response fields, spider attributes etc. ☆56 · Updated 3 years ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆106 · Updated 2 months ago
- Convert HTML to JSON. Can also (intelligently) convert HTML tables to JSON (using table headers (if available) as keys in the resulting J… ☆50 · Updated last year
- Diff Match Patch is a high-performance library in multiple languages that manipulates plain text. ☆52 · Updated 4 months ago
- Python clients for Zyte AutoExtract API ☆40 · Updated 3 years ago
- Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city. ☆129 · Updated last year
- Web grep: search all rendered resources used by a URI ☆87 · Updated last month
- A pure Python Levenshtein implementation that's not freaking GPL'd. ☆97 · Updated 2 years ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆66 · Updated 2 years ago
- Detect and classify pagination links ☆102 · Updated 4 years ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆151 · Updated last year
- Python wrapper for RE2 ☆103 · Updated 3 weeks ago