Easily crawl news portals or blog sites using Storm Crawler.
☆21 Nov 15, 2022 Updated 3 years ago
Alternatives and similar repositories for crawling-framework
Users that are interested in crawling-framework are comparing it to the libraries listed below.
- Multilingual library to easily parse date strings to java.util.Date objects. ☆31 Sep 4, 2019 Updated 6 years ago
- Integration between Reaction ECommerce and Accelerated Text to provide product descriptions for an e-shop. ☆13 Feb 22, 2021 Updated 5 years ago
- Beagle helps you identify keywords, phrases, regexes, and complex search queries of interest in streams of text documents. ☆54 Jun 30, 2021 Updated 4 years ago
- Leiningen template for AWS Lambda custom runtime with GraalVM native image compiled Clojure projects. ☆45 Oct 5, 2020 Updated 5 years ago
- Accelerated Text is a no-code natural language generation platform. It will help you construct document plans which define how your data … ☆808 Mar 10, 2023 Updated 3 years ago
- Clojure wrapper for `jackson-jq`. Embed `jq` scripts into your app. Compatible with GraalVM native-image. ☆21 Sep 29, 2023 Updated 2 years ago
- Dataset of Lithuanian legal entities ☆13 Nov 21, 2023 Updated 2 years ago
- Convert a number to an approximated text expression: from '0.23' to 'less than a quarter'. ☆201 Jan 20, 2021 Updated 5 years ago
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe… ☆69 Updated this week
- Repeat statsd packets to riemann ☆17 Jan 1, 2015 Updated 11 years ago
- Storm / Solr Integration ☆19 Feb 2, 2024 Updated 2 years ago
- A Java library that can do URL normalization, URL unshortening, and URL extraction. ☆19 Oct 19, 2017 Updated 8 years ago
- Clojure LLM - Dataset curation for fine tuning an LLM for Clojure. ☆17 Jun 12, 2023 Updated 2 years ago
- A simple accounting application for Django ☆19 Sep 13, 2013 Updated 12 years ago
- An HTTP proxy demo written with Rust/Tokio ☆16 Sep 17, 2020 Updated 5 years ago
- API definition, resources and reference implementation of URL Frontiers ☆59 Jan 23, 2026 Updated 2 months ago
- Mavuno: A Hadoop-Based Text Mining Toolkit ☆47 Feb 7, 2012 Updated 14 years ago
- Convert powerpoint (pptx) files into raw text org or LaTeX files ☆15 Aug 28, 2018 Updated 7 years ago
- Snowball Stemmer for Clojure ☆18 Jun 7, 2022 Updated 3 years ago
- Zulia Search Engine ☆36 Apr 10, 2026 Updated last week
- Timestone enables you to create deterministic and easy-to-understand unit tests for time-dependent, concurrent Go code. ☆16 Apr 21, 2025 Updated 11 months ago
- A robots.txt parser written in Clojure. ☆16 Dec 15, 2011 Updated 14 years ago
- Human Friendly Way to look for Kubernetes Events ☆30 Mar 24, 2026 Updated 3 weeks ago
- w3act is an annotation and curation tool for building web archive collections ☆21 Jan 30, 2024 Updated 2 years ago
- A set of machine learning experiments in Clojure ☆30 Nov 30, 2012 Updated 13 years ago
- GraalVM GitHub action ☆13 Jun 25, 2022 Updated 3 years ago
- Simple background tasks for Django ☆21 Jun 4, 2023 Updated 2 years ago
- Tools for Lithuanian language processing ☆16 Jun 15, 2016 Updated 9 years ago
- Chinese Tokenizer module for Python ☆16 Jul 3, 2018 Updated 7 years ago
- Linguistic Slovak stemmer based on Lucene stemmers ☆11 Apr 15, 2016 Updated 10 years ago
- The Clojure programming language ☆15 Dec 11, 2025 Updated 4 months ago
- The Solr Package Directory and Sanctuary ☆13 Oct 14, 2025 Updated 6 months ago
- Lithuanian open data catalog (data.gov.lt). ☆20 Updated this week
- Quarkus Tika extension ☆14 Apr 8, 2026 Updated last week
- Scrape financial news from Yahoo and analyse the sentiment (PoC) ☆20 Jul 16, 2019 Updated 6 years ago
- Prototype SOLR-powered web archive exploration UI. ☆43 Jun 3, 2020 Updated 5 years ago
- GitHub Action to run clojure.test by Babashka ☆14 Dec 29, 2021 Updated 4 years ago
- A tool to assign Sustainable Development Goals to a scientific abstract ☆18 Feb 25, 2021 Updated 5 years ago
- Application configuration and scripts for search on https://docs.vespa.ai/ ☆12 Mar 27, 2026 Updated 3 weeks ago