Easily crawl news portals or blog sites using Storm Crawler.
☆21Nov 15, 2022Updated 3 years ago
Alternatives and similar repositories for crawling-framework
Users who are interested in crawling-framework are comparing it to the libraries listed below.
- Integration between Reaction ECommerce and Accelerated Text to provide product descriptions for an e-shop.☆13Feb 22, 2021Updated 5 years ago
- Beagle helps you identify keywords, phrases, regexes, and complex search queries of interest in streams of text documents.☆54Jun 30, 2021Updated 4 years ago
- Accelerated Text is a no-code natural language generation platform. It will help you construct document plans which define how your data …☆806Mar 10, 2023Updated 3 years ago
- Dataset of Lithuanian legal entities☆13Nov 21, 2023Updated 2 years ago
- Convert a number to an approximated text expression: from '0.23' to 'less than a quarter'.☆201Jan 20, 2021Updated 5 years ago
- Python wrapper for Accelerated Text☆12Oct 5, 2021Updated 4 years ago
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe…☆68Feb 18, 2026Updated last month
- Repeat statsd packets to riemann☆17Jan 1, 2015Updated 11 years ago
- Storm / Solr Integration☆19Feb 2, 2024Updated 2 years ago
- Opinionated command line argument handling, with excellent support for subcommands☆48Oct 27, 2025Updated 5 months ago
- A Java library that can do URL normalization, unshorten URL, and URL extraction.☆19Oct 19, 2017Updated 8 years ago
- A simple accounting application for Django☆19Sep 13, 2013Updated 12 years ago
- An HTTP proxy demo written with Rust/Tokio☆16Sep 17, 2020Updated 5 years ago
- Run Mattermost on Heroku☆35Sep 19, 2022Updated 3 years ago
- GOPHI: an AMR-to-English Verbalizer☆11Feb 5, 2020Updated 6 years ago
- Ready-to-use examples of dkpro-core components and pipelines.☆35Dec 16, 2023Updated 2 years ago
- Zulia Search Engine☆36Updated this week
- Timestone enables you to create deterministic and easy-to-understand unit tests for time-dependent, concurrent Go code.☆16Apr 21, 2025Updated 11 months ago
- StrapiRuby is a Ruby wrapper gem around Strapi REST API. #hacktoberfest☆14Jul 20, 2025Updated 8 months ago
- A robots.txt parser written in Clojure.☆16Dec 15, 2011Updated 14 years ago
- Human Friendly Way to look for Kubernetes Events☆30Updated this week
- w3act is an annotation and curation tool for building web archive collections☆21Jan 30, 2024Updated 2 years ago
- ☆13Oct 16, 2025Updated 5 months ago
- Tooling to build LLM applications: prompt templating and composition, agents, LLM memory, and other instruments for builders of AI applic…☆366Jan 8, 2026Updated 2 months ago
- GraalVM GitHub action☆13Jun 25, 2022Updated 3 years ago
- Simple background tasks for Django☆21Jun 4, 2023Updated 2 years ago
- Chinese Tokenizer module for Python☆16Jul 3, 2018Updated 7 years ago
- Helpers to extend Django Admin with data from external service with minimal hacks☆25Aug 25, 2025Updated 7 months ago
- Linguistic Slovak stemmer based on Lucene stemmers☆11Apr 15, 2016Updated 9 years ago
- Demo using Apache Lucene as a reverse geocoder, running as a CLI app via Graal, AWS Lambda or Google Cloud Run☆12Apr 20, 2021Updated 4 years ago
- The Solr Package Directory and Sanctuary☆13Oct 14, 2025Updated 5 months ago
- Lithuanian open data catalog (data.gov.lt).☆20Updated this week
- Quarkus Tika extension☆13Feb 19, 2026Updated last month
- A graph database built on top of MySQL using a node adjacency table and joins for traversal.☆19May 9, 2010Updated 15 years ago
- Scrape financial news from Yahoo and analyse the sentiment (PoC)☆20Jul 16, 2019Updated 6 years ago
- ArchiveWeb.page Express!☆14Nov 1, 2024Updated last year
- Friendly webmaster forum☆14Jun 28, 2016Updated 9 years ago
- A collection of GGUF and quantizations for jina-embeddings-v4☆34Sep 18, 2025Updated 6 months ago
- Docker Compose based system for running remote browsers (including Flash and Java support) connected to web archives☆16Jun 10, 2021Updated 4 years ago