bejean / crawl-anywhere
Crawl-Anywhere - Web Crawler and document processing pipeline with Solr integration.
☆98Jul 1, 2017Updated 8 years ago
Alternatives and similar repositories for crawl-anywhere
Users who are interested in crawl-anywhere are comparing it to the repositories listed below
- Blog crawler for the blogforever project.☆23Jan 31, 2014Updated 12 years ago
- Fureteur is a simple, configurable, fault-tolerant web crawler written in Scala☆28Oct 14, 2014Updated 11 years ago
- An online sentiment analyzer built with Flask and TextBlob☆15Sep 3, 2013Updated 12 years ago
- A simple library for loading word2vec binary model.☆12Sep 17, 2015Updated 10 years ago
- Android Tracks☆30Apr 28, 2022Updated 3 years ago
- ☆66Dec 11, 2016Updated 9 years ago
- The easiest way to get started with React.js development☆11Jul 29, 2016Updated 9 years ago
- extensible Web Retrieval Toolkit☆17Jun 2, 2022Updated 3 years ago
- Evolving expressions using genetic algorithms☆17Jun 20, 2020Updated 5 years ago
- modular NL platform for dialogue agents☆17Oct 26, 2017Updated 8 years ago
- Apache Nutch extensions☆34Mar 21, 2022Updated 3 years ago
- fetchIO is a simple, configurable, fault-tolerant web crawler written in Haskell☆23Feb 16, 2017Updated 8 years ago
- A library for financial and time series calculations on Apache Spark☆28Feb 2, 2016Updated 10 years ago
- A Data Mesh demo repository☆13Oct 10, 2024Updated last year
- Cloud Mining automatically builds exploratory faceted search systems.☆52Oct 15, 2013Updated 12 years ago
- Structured Data Extractor. An application to extract structured data from web pages. It uses Data Extraction Based on Partial Tree Alignm…☆49Jun 9, 2012Updated 13 years ago
- Modularly extensible semantic metadata validator☆85Dec 10, 2015Updated 10 years ago
- Focused Crawler for VT's CTRNet☆10May 13, 2013Updated 12 years ago
- Human resource management system implemented with Filament PHP.☆13Dec 28, 2022Updated 3 years ago
- A generic interface wrapping multiple backends to provide a consistent pubsub API☆13Oct 31, 2018Updated 7 years ago
- TravianxT4☆14Sep 22, 2011Updated 14 years ago
- A self-contained morphological analyzer (including dictionary data).☆33Jul 30, 2015Updated 10 years ago
- Digitization information system built on top of Fedora repository☆16Jan 15, 2019Updated 7 years ago
- generate custom supreme box logos☆13Nov 28, 2017Updated 8 years ago
- Bicycle Incident reporting☆13Jul 22, 2022Updated 3 years ago
- Simple CSS based rating plugin☆17Dec 4, 2014Updated 11 years ago
- Islandora Solr Search module☆24Jul 28, 2025Updated 6 months ago
- Entity Linking for the masses☆56Nov 10, 2015Updated 10 years ago
- ☆12Oct 25, 2015Updated 10 years ago
- Flask app for monitoring OEE☆11Sep 25, 2023Updated 2 years ago
- PHP HandlerSocket plugin for MySQL Improved Extension☆21Jun 26, 2020Updated 5 years ago
- Performs multi document summarization. Includes a method to generate summaries: The method uses a sentence importance score calculator ba…☆38Apr 7, 2013Updated 12 years ago
- An open-source news aggregator☆15Sep 9, 2016Updated 9 years ago
- Web crawler☆42May 14, 2016Updated 9 years ago
- Standalone version of Chrome's about:gpu profiling tool☆22Aug 26, 2020Updated 5 years ago
- A minimalistic PvP game server☆11Jul 28, 2021Updated 4 years ago
- ☆12Apr 7, 2015Updated 10 years ago
- CROMER (CROss-document Main Events and entities Recognition), is a tool for cross-document coreference☆12Jan 14, 2015Updated 11 years ago
- Distributed Web Crawler, Parser and Search Engine.☆10Jun 16, 2016Updated 9 years ago