☆67Dec 11, 2016Updated 9 years ago
Alternatives and similar repositories for nutch-custom-search
Users that are interested in nutch-custom-search are comparing it to the libraries listed below.
- A Nutch 2.2.1 plugin which allows users to shuffle off the responsibility for retrieving pages to a selenium hub/node spoke system. This …☆16Jun 9, 2016Updated 9 years ago
- FoGFaaS: Add serverless computing (faas) to ifogsim☆22Mar 30, 2025Updated last year
- Apache Nutch extensions☆34Mar 21, 2022Updated 4 years ago
- Simple search results with Solr and EmberJS☆58Mar 5, 2019Updated 7 years ago
- ☆28Jun 9, 2016Updated 9 years ago
- Crawl-Anywhere - Web Crawler and document processing pipeline with Solr integration.☆98Jul 1, 2017Updated 8 years ago
- High Performance Marmotta Backend implementation in C++ (using gRPC and LevelDB)☆16Dec 12, 2015Updated 10 years ago
- XBlock to use SCORM content in Open edX. Main development in use_ssla_player branch, requires commercial SSLA player by JCA Solutions.☆12Jun 21, 2023Updated 2 years ago
- ☆25Apr 6, 2015Updated 11 years ago
- Collects multimedia content shared through social networks.☆19Feb 18, 2015Updated 11 years ago
- A semantic web crawler☆20Sep 20, 2010Updated 15 years ago
- SKOS Support for Apache Lucene and Solr☆56May 12, 2021Updated 5 years ago
- Simplified scalable aggregation and processing framework built upon Apache Camel.☆22Sep 15, 2018Updated 7 years ago
- Kairos combines a focused crawler and an information extraction engine to convert a list of conference websites into an index filled wit…☆19Feb 20, 2011Updated 15 years ago
- Implicit relation extractor using a natural language model.☆24May 25, 2018Updated 7 years ago
- ☆20May 11, 2026Updated last week
- Provides PDF Preview for LilyPond-generated PDFs. Supports point-and-click from PDF to source code.☆10May 8, 2023Updated 3 years ago
- Boilerplate express application for building telegram bots☆12Jun 5, 2018Updated 7 years ago
- Noviat apps.odoo.com repository☆20Mar 5, 2021Updated 5 years ago
- The first Open Source document analysis platform☆65Aug 2, 2021Updated 4 years ago
- Code repository for R Data Mining Blueprints, published by Packt☆10Jan 14, 2021Updated 5 years ago
- Arduino Synthesizer Sampler☆16Mar 21, 2026Updated last month
- ☆11Nov 16, 2019Updated 6 years ago
- A Text Classification API in Java originally developed by DigitalPebble Ltd. The API is independent from the ML implementations used and …☆48Sep 24, 2021Updated 4 years ago
- Parses Solr's log file to get some basic query statistics☆20Nov 14, 2018Updated 7 years ago
- Utility to translate NIF files across identifier schemes, such as DBpedia and Wikidata☆11Aug 24, 2019Updated 6 years ago
- Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.☆423Mar 30, 2023Updated 3 years ago
- Wowza Media Server modules☆13Nov 16, 2014Updated 11 years ago
- WordNet to neo4j 2.2☆12Nov 6, 2015Updated 10 years ago
- Resources for KGC: languages/tools/evaluation-systems☆15Oct 13, 2020Updated 5 years ago
- Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.☆284Apr 25, 2018Updated 8 years ago
- Document clustering based on Latent Semantic Analysis☆96Apr 29, 2010Updated 16 years ago
- Apache Camel, Kafka Component☆21Jul 7, 2014Updated 11 years ago
- DEPRECATED (see README) DKAN Dataset - Provides the ability to create and publish datasets in a DCAT compatible format: http://www.w3.org…☆15Mar 15, 2017Updated 9 years ago
- A nerd's boilerplate for your Python project.☆18Oct 15, 2020Updated 5 years ago
- December 14th Python Meetup Files☆40Mar 2, 2013Updated 13 years ago
- Neural Machine Translation project for NLP Fall 2016☆10Dec 20, 2016Updated 9 years ago
- Islandora Solr Search module☆24Jul 28, 2025Updated 9 months ago
- ProceXSS is an ASP.NET HTTP module that tries to prevent XSS attacks.☆13Sep 9, 2018Updated 7 years ago