Nutch-Python is a Python binding to the Apache Nutch™ REST services allowing Nutch to be called natively in the Python community.
☆39 · Apr 15, 2016 · Updated 9 years ago
Alternatives and similar repositories for nutch-python
Users that are interested in nutch-python are comparing it to the libraries listed below
- For interacting with nutch via Python ☆29 · Feb 18, 2026 · Updated last week
- Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features. ☆108 · Apr 9, 2025 · Updated 10 months ago
- Combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text. ☆34 · May 3, 2023 · Updated 2 years ago
- Vizlinc ☆15 · Jan 14, 2016 · Updated 10 years ago
- Stanford CoreNLP NER addon for Apache Tika's NamedEntityParser ☆13 · Feb 26, 2022 · Updated 4 years ago
- ☆44 · Jan 15, 2016 · Updated 10 years ago
- Uses Apache Lucene, OpenNLP and geonames and extracts locations from text and geocodes them. ☆38 · Apr 9, 2024 · Updated last year
- Scraper built with Scrapy. ☆18 · Aug 14, 2024 · Updated last year
- This is the ETL lib package. It provides an API to munge and prepare JSON, TSV and other data using Apache Tika and JSON parsing/loading … ☆18 · Jan 27, 2024 · Updated 2 years ago
- ☆14 · Dec 24, 2016 · Updated 9 years ago
- Identifying and Analyzing Researchers on Twitter ☆18 · Aug 9, 2017 · Updated 8 years ago
- Tools for scraping of twitter data, conversion, text analysis and graph construction ☆11 · Aug 1, 2016 · Updated 9 years ago
- Pattern-of-Behavior Search Tool ☆11 · Jun 20, 2022 · Updated 3 years ago
- A dataset downloaded from the deep and scientific web across three major Polar data centers for use in research. ☆13 · Sep 8, 2017 · Updated 8 years ago
- General Architecture for Text Engineering ☆49 · Mar 23, 2016 · Updated 9 years ago
- Polar USC activities related to NSF Polar CyberInfrastructure program at the University of Southern California ☆15 · Jan 15, 2023 · Updated 3 years ago
- Simple RESTful API server running your own machine translation model. Docker image modified from mbartoli/easy-smt ☆11 · Apr 28, 2019 · Updated 6 years ago
- Training activities for the Arctic Data Center ☆10 · Dec 6, 2022 · Updated 3 years ago
- Front-end for the MediaCloud database ☆16 · Apr 3, 2018 · Updated 7 years ago
- MEMEX Weapons Pilot for the illegal weapons domain. ☆15 · May 20, 2016 · Updated 9 years ago
- Trending on Accumulo ☆40 · Oct 3, 2012 · Updated 13 years ago
- For extracting measurements and related entities from text ☆58 · May 6, 2020 · Updated 5 years ago
- Viewers for statistics and dashboarding of Domain Search Engine data ☆126 · Jan 19, 2016 · Updated 10 years ago
- A toolkit for clustering web pages based on various similarity measures. ☆34 · Oct 27, 2021 · Updated 4 years ago
- Next generation graph processing platform ☆12 · Aug 26, 2016 · Updated 9 years ago
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess… ☆17 · Sep 11, 2015 · Updated 10 years ago
- The User Activity Logging Engine, or User-ALE, is a logging mechanism used to quantitatively assess the behavioural and cognitive state o… ☆13 · Aug 26, 2016 · Updated 9 years ago
- ☆25 · Jan 26, 2016 · Updated 10 years ago
- Sensefy is a federated enterprise semantic search framework built on Apache ManifoldCF, Apache Solr and Apache Stanbol. Development is sp… ☆15 · Jul 11, 2022 · Updated 3 years ago
- Building an API with the FastAPI framework to serve a scikit-learn model. ☆18 · Jan 13, 2019 · Updated 7 years ago
- Meta information for the DARPA open catalog project. ☆56 · Nov 16, 2017 · Updated 8 years ago
- A simple API for CouchSurfing.org ☆31 · Dec 19, 2016 · Updated 9 years ago
- Diachronic text analysis in Python ☆27 · May 28, 2020 · Updated 5 years ago
- Seed acquisition tool to bootstrap focused crawlers ☆23 · Apr 24, 2017 · Updated 8 years ago
- Introduction to TensorFlow. Basic operators, linear and logistic regression and Tensorboard ☆22 · Mar 7, 2017 · Updated 8 years ago
- ☆20 · Nov 1, 2017 · Updated 8 years ago
- stav text annotation visualiser ☆34 · Nov 2, 2011 · Updated 14 years ago
- Convert URLs to a normalized unicode format ☆67 · Feb 8, 2018 · Updated 8 years ago
- OZONE Widget Framework ☆329 · Jan 24, 2018 · Updated 8 years ago