ContinuumIO / nutchpy
For interacting with nutch via Python
☆29Updated last week
Alternatives and similar repositories for nutchpy
Users who are interested in nutchpy are comparing it to the libraries listed below.
- Uses Apache Lucene, OpenNLP and geonames and extracts locations from text and geocodes them.☆38Updated last year
- Nutch-Python is a Python binding to the Apache Nutch™ REST services allowing Nutch to be called natively in the Python community.☆39Updated 9 years ago
- Mirror of Apache Stanbol (incubating)☆114Updated last year
- Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets☆92Updated 9 years ago
- Kira is an astronomy image processing toolkit implemented with Apache Spark.☆15Updated 9 years ago
- Topic modeling web application☆40Updated 10 years ago
- A Python library for learning from dimensionality reduction, supporting sparse and dense matrices.☆78Updated 8 years ago
- SociaLite: query language for large-scale graph analysis and data mining☆110Updated 9 years ago
- Solr Dictionary Annotator (Microservice for Spark)☆71Updated 5 years ago
- PyRDM is a Python-based library for research data management (RDM). It facilitates the automated publication of scientific software and a…☆32Updated 4 years ago
- A dataset downloaded from the deep and scientific web across three major Polar data centers for use in research.☆13Updated 8 years ago
- Scientific Spark - a NASA AIST14 project☆86Updated 7 years ago
- Framework for deploying Hadoop clusters on traditional HPC from userland☆45Updated 7 years ago
- Unified interface for local and distributed ndarrays☆157Updated 7 years ago
- A set of benchmark problems and implementations for Python☆65Updated 2 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible.☆52Updated 8 years ago
- ☆21Updated 9 years ago
- scikit-learn addon to operate on set/"group"-based features☆41Updated 9 years ago
- Distributed Numpy☆148Updated 7 years ago
- A repo that contains outgoing links from DBpedia☆50Updated 5 years ago
- Warcbase is an open-source platform for managing and analyzing web archives☆161Updated 7 years ago
- pelican-bibtex: Manage your academic publications page with Pelican and BibTeX☆52Updated 2 years ago
- People. Places. Things. Graphs.☆93Updated 11 years ago
- stav text annotation visualiser☆34Updated 14 years ago
- Pattern-of-Behavior Search Tool☆11Updated 3 years ago
- Vizlinc☆15Updated 9 years ago
- NER tagger for English, Spanish, Dutch, Italian, German and French.☆35Updated 10 years ago
- Alchemist: an Apache Spark<->MPI interface☆26Updated 7 years ago
- ☆16Updated 8 years ago
- ☆92Updated 10 years ago