kaqqao / nutch-element-selector
Nutch 2.3.1 plugin for whitelisting/blacklisting specific HTML elements
☆14 Updated 3 years ago
Alternatives and similar repositories for nutch-element-selector
Users that are interested in nutch-element-selector are comparing it to the libraries listed below
- An academic open source and open data web crawler ☆27 Updated 7 years ago
- MetaSync ☆20 Updated 9 years ago
- Document Imaging Archive System. Home document imaging, with OCR. Scan documents (with SANE) or import ODF documents, assign tags. Use op… ☆25 Updated 10 years ago
- manage a repository of 0install feeds ☆18 Updated last year
- This repository is outdated and will be discontinued. For latest code and information check: http://github.com/gpgmail/GPGMail ☆54 Updated 7 years ago
- A powerful Pacman (Package Manager) frontend using Qt libs ☆11 Updated 10 years ago
- Performance dashboard ☆19 Updated this week
- fuzzydb is a fuzzy matching database engine capable of providing human-like search results that make life much easier for users of websit… ☆20 Updated 2 years ago
- Highly performant version of open-text-summarizer ☆38 Updated 11 years ago
- Compiler for writing DeepDive applications in a Datalog-like language — ⚠️🚧🛑 REPO MOVED TO DEEPDIVE 👇🏿 ☆19 Updated 8 years ago
- Build simple social graphs for GitHub ☆15 Updated 10 years ago
- Safely print shared library dependencies (similar to ldd) ☆23 Updated 2 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆24 Updated 8 years ago
- Collects multimedia content shared through social networks. ☆19 Updated 10 years ago
- Advanced FireWall cookbook for Chef and Linux that uses Iptables and to dynamically configure inbound and outbound rules on each node. ☆40 Updated 10 years ago
- Crawl and render JavaScript templates. ☆10 Updated 7 years ago
- Treat curl configuration files as curlrc subcommands. ☆11 Updated 4 years ago
- Block median value perceptual hash RFC for URN namespace ☆27 Updated 5 years ago
- ☆7 Updated 6 years ago
- Digital signatures to guarantee integrity and authenticity of collections of records. ☆12 Updated 3 years ago
- Chambua is an open-source semantic tagging application that analyses text and extracts names of people, places (& geocodes them), organis… ☆33 Updated 3 years ago
- Masques is a distributed social network. ☆36 Updated 9 years ago
- IETF Drafts and Standards ☆13 Updated 6 years ago
- run multiple shell commands in parallel and coordinate their output ☆31 Updated 13 years ago
- HTTP Shell is a CLI tool based on the Kui framework that provides developers a modern alternative to http clients for interacting with AP… ☆12 Updated 4 years ago
- Use YAML to define your environment ☆10 Updated 8 years ago
- SurveyMan programming language. ☆46 Updated 8 years ago
- ☆10 Updated 7 years ago
- The asm.js benchmark ☆48 Updated 7 years ago
- Shell script to copy an entire website ☆11 Updated 6 years ago