paulsmith / templatemaker
templatemaker is a Python library that can extract data from files with a similar format, like HTML pages.
☆63 · Updated 4 years ago
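To illustrate the general idea of extracting data from similarly formatted files, here is a minimal sketch built only on Python's standard `difflib` and `re` modules. It is not templatemaker's own algorithm or API: the function names `learn_template` and `extract`, the `HOLE` marker, and the tokenization rule are all assumptions made for this example. The sketch compares two sample strings, keeps the parts they share as literal text, and treats the parts that differ as holes to be filled from new input.

```python
import re
from difflib import SequenceMatcher

HOLE = None  # hypothetical marker for a region that varies between samples


def tokenize(text):
    # Split into words and single non-word characters so that
    # matching happens on whole words rather than stray letters.
    return re.findall(r"\w+|\W", text)


def learn_template(sample_a, sample_b):
    """Return a list of literal strings and HOLE markers shared by both samples."""
    a, b = tokenize(sample_a), tokenize(sample_b)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    parts = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append("".join(a[i1:i2]))
        elif not parts or parts[-1] is not HOLE:
            # Any differing region becomes a single hole.
            parts.append(HOLE)
    return parts


def extract(parts, text):
    """Return the text filling each hole, or None if the text does not fit."""
    pattern = "".join("(.*?)" if p is HOLE else re.escape(p) for p in parts)
    match = re.fullmatch(pattern, text, re.DOTALL)
    return match.groups() if match else None


template = learn_template("<b>Hello, Alice!</b>", "<b>Hello, Bob!</b>")
print(extract(template, "<b>Hello, Carol!</b>"))  # ('Carol',)
```

A real tool would need to learn from more than two samples and tolerate noise between them; this sketch only shows the shape of the technique.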
Alternatives and similar repositories for templatemaker:
Users who are interested in templatemaker are comparing it to the libraries listed below:
- Extract, parse and populate templates from strings ☆27 · Updated 5 years ago
- MapReduce platform in Python ☆34 · Updated 9 years ago
- Python's missing statistical Swiss Army knife ☆15 · Updated 9 years ago
- Regular expression based parsers for extracting data from natural languages ☆70 · Updated 7 years ago
- A high-performance distributed web crawling & scraping framework written with golang and Python. ☆30 · Updated 8 years ago
- Lightweight, multilingual natural language processing ☆63 · Updated 11 years ago
- Pythonic interface to redis-py ☆98 · Updated 7 years ago
- Efficiently search the most similar strings against the query in Python. ☆18 · Updated 6 years ago
- A tiny Python utility that converts data crawled from different services into a cloud of words ☆30 · Updated 6 years ago
- A module for querying the DOM tree and writing XPath expressions using native Python syntax. ☆127 · Updated 6 years ago
- Unofficial git mirror of http://svn.whoosh.ca svn repo ☆49 · Updated 15 years ago
- Roaring Bitmap in Cython ☆80 · Updated 9 months ago
- A Python implementation of DEPTA ☆83 · Updated 8 years ago
- Turn your IPython console into a cross-database SQL client ☆31 · Updated 8 years ago
- A high performance Python bloom filter thing. ☆44 · Updated 6 years ago
- Python module (C extension and plain Python) implementing DAWG ☆20 · Updated 3 years ago
- 📦 Encapsulated MongoEngine ☆19 · Updated 4 years ago
- Find which links on a web page are pagination links ☆29 · Updated 8 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆147 · Updated last month
- ☆81 · Updated 4 years ago
- High Level Kafka Scanner ☆19 · Updated 7 years ago
- Small set of utilities to simplify writing Scrapy spiders. ☆49 · Updated 9 years ago
- Paginating the web ☆37 · Updated 11 years ago
- A Django based search engine powered by CouchDB, celery and whoosh. ☆49 · Updated 9 years ago
- Python implementation of the Parsley language for extracting structured data from web pages ☆92 · Updated 7 years ago
- python-readability, but faster (mirror-ish) ☆84 · Updated 13 years ago
- Modularly extensible semantic metadata validator ☆83 · Updated 9 years ago
- A better repr for closures ☆11 · Updated 8 years ago
- Faster replacement for Python's urlparse module ☆46 · Updated 6 years ago
- A powerful analytics Python library for Redis. ☆36 · Updated 9 years ago