chrismattmann/nutch-python

Nutch-Python is a Python binding to the Apache Nutch™ REST services allowing Nutch to be called natively in the Python community.

☆39

Alternatives and similar repositories for nutch-python

Users that are interested in nutch-python are comparing it to the libraries listed below.

chrismattmann / tika-similarity
Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
☆108 · Jun 2, 2026 · Updated last month
mitll / topic-clustering
☆44 · Jan 15, 2016 · Updated 10 years ago
ContinuumIO / scrapy_scrapers
Scraper built with Scrapy.
☆18 · Jun 25, 2026 · Updated 3 weeks ago
nasa-jpl-memex / weapons
MEMEX Weapons Pilot for the illegal weapons domain.
☆15 · May 20, 2016 · Updated 10 years ago
nasa-jpl-memex / topic_space
Topic modeling web application
☆40 · Jul 23, 2015 · Updated 10 years ago
adidier17 / AuthorRank
A modification of PageRank to find the most prestigious authors in a scientific collaboration network.
☆15 · Jul 6, 2023 · Updated 3 years ago
chrismattmann / lucene-geo-gazetteer
Uses Apache Lucene, OpenNLP and geonames and extracts locations from text and geocodes them.
☆38 · Jun 5, 2026 · Updated last month
nasa-jpl-memex / memex-explorer
Viewers for statistics and dashboarding of Domain Search Engine data
☆128 · Jan 19, 2016 · Updated 10 years ago
mitll / TweetE
Tools for scraping of twitter data, conversion, text analysis and graph construction
☆11 · Aug 1, 2016 · Updated 9 years ago
mitll / graph-qube
Pattern-of-Behavior Search Tool
☆11 · Jun 20, 2022 · Updated 4 years ago
chrismattmann / trec-dd-polar
A dataset downloaded from the deep and scientific web across three major Polar data centers for use in research.
☆13 · Sep 8, 2017 · Updated 8 years ago
autonlab / ActiveSearch
☆20 · Mar 31, 2017 · Updated 9 years ago
draperlaboratory / user-ale
The User Activity Logging Engine, or User-ALE, is a logging mechanism used to quantitatively assess the behavioural and cognitive state o…
☆13 · Aug 26, 2016 · Updated 9 years ago
USCDataScience / autoextractor
A toolkit for clustering web pages based on various similarity measures.
☆34 · Oct 27, 2021 · Updated 4 years ago
darpa-xdata / xlang
☆21 · Jan 23, 2016 · Updated 10 years ago
smallk / smallk.github.io
SmallK: very fast data clustering tools
☆13 · Apr 3, 2019 · Updated 7 years ago
ericwhyne / open-catalog-generator
Code and templates required to build the DARPA open catalog.
☆18 · Mar 23, 2016 · Updated 10 years ago
snap-stanford / snap-dev
SNAP repository for Ringo
☆15 · Jul 25, 2017 · Updated 8 years ago
mndrix / list_util
Prolog list utility predicates
☆11 · Jul 19, 2018 · Updated 8 years ago
Aptima / pattern-matching
Hadoop MapReduce over Hive based implementation of attributed network pattern matching.
☆40 · Sep 16, 2014 · Updated 11 years ago
USCDataScience / dl4j-kerasimport-examples
This repository contains deeplearning4j examples for importing and making use of models trained in keras
☆27 · May 7, 2017 · Updated 9 years ago
kxtells / vague-places
☆14 · Dec 24, 2016 · Updated 9 years ago
L3S / twitter-researcher
Identifying and Analyzing Researchers on Twitter
☆18 · Aug 9, 2017 · Updated 8 years ago
TeamHG-Memex / docker-tor-rotator
A rotating socks proxy using Tor, Delegate and Haproxy
☆14 · Apr 8, 2026 · Updated 3 months ago
Sotera / graphene
☆25 · Jan 26, 2016 · Updated 10 years ago
kaneplusplus / flexmem
☆17 · Dec 31, 2015 · Updated 10 years ago
mitll / MITIE
MITIE: library and tools for information extraction
☆29 · Jan 22, 2015 · Updated 11 years ago
aglahe / vagrant-memex
DARPA MEMEX project Vagrant VM
☆53 · Oct 17, 2016 · Updated 9 years ago
mille856 / CMU_memex
☆20 · Nov 1, 2017 · Updated 8 years ago
Sotera / track-communities
A series of analytics for creating networks from geo-temporal track data based on time/space co-occurrence. Includes UI for visualizatio…
☆14 · Aug 30, 2018 · Updated 7 years ago
usc-cloud / hadoop-louvain-community
Map Reduce Implementation of a community detection algorithm extending Louvain method for community detection.
☆15 · Jan 13, 2016 · Updated 10 years ago
giantoak / MMPP
R-implementation of a Markov-Modulated Poisson Process for unsupervised event detection.
☆15 · Dec 26, 2015 · Updated 10 years ago
jedisct1 / spark-kafka-docker
Docker container to locally run Spark and Kafka
☆15 · Sep 5, 2016 · Updated 9 years ago
joshua-decoder / thrax
Hadoop-based tool for extraction of large scale synchronous grammars for paraphrasing and machine translation
☆15 · Dec 2, 2016 · Updated 9 years ago
unchartedsoftware / aperturejs
ApertureJS - an open, adaptable and extensible JavaScript visualization framework
☆56 · May 6, 2016 · Updated 10 years ago
KnowledgeCaptureAndDiscovery / wings
Wings workflow system
☆51 · Aug 19, 2024 · Updated last year
mlj0381 / YmmRecommenderSystems
Yunmanman (运满满) algorithm research and data development
☆11 · Nov 13, 2017 · Updated 8 years ago
crockpotveggies / dl4j-examples
Deeplearning4j Examples (DL4J, DL4J Spark, DataVec)
☆10 · Aug 16, 2018 · Updated 7 years ago
codemeta / codemeta-paper
Codemeta paper.
☆10 · Jul 10, 2017 · Updated 9 years ago