Apache Nutch extensions
☆34Mar 21, 2022Updated 3 years ago
Alternatives and similar repositories for nutch-plugins
Users who are interested in nutch-plugins are comparing it to the libraries listed below
- Distributed Web Crawler, Parser and Search Engine.☆10Jun 16, 2016Updated 9 years ago
- ☆66Dec 11, 2016Updated 9 years ago
- Code for KDD 2014 paper "Mining Topics in Documents: Standing on the Shoulders of Big Data"☆21Oct 6, 2015Updated 10 years ago
- common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text☆35Sep 30, 2016Updated 9 years ago
- knyfe is a python utility for rapid exploration of datasets.☆54Apr 3, 2015Updated 10 years ago
- export snapshot to S3 and restore to workspace☆11May 7, 2022Updated 3 years ago
- Implementation of an algorithm computing the nearest "N" neighbours to a vector, using a collection of hyperplane hashers.☆30Jul 17, 2015Updated 10 years ago
- ☆55Jan 10, 2020Updated 6 years ago
- Boosting and ensemble learning in Python.☆54Apr 6, 2015Updated 10 years ago
- Cloud Mining automatically builds exploratory faceted search systems.☆52Oct 15, 2013Updated 12 years ago
- The goal of this experiment is to take articles and certain metadata and group them by topic.☆11Apr 14, 2016Updated 9 years ago
- Experimenting with GANs in Tensorflow/Keras☆10Jan 13, 2022Updated 4 years ago
- Simple, beautiful discussion forums - for customer support, news aggregation, QA sites, and online communities.☆56Dec 9, 2012Updated 13 years ago
- Chessboard and chess rules made in jQuery☆15Jul 24, 2013Updated 12 years ago
- ☆12Oct 25, 2015Updated 10 years ago
- A generic interface wrapping multiple backends to provide a consistent pubsub API☆13Oct 31, 2018Updated 7 years ago
- Queueable interfaces - Unleash the async power!☆12Nov 3, 2014Updated 11 years ago
- A responsive & browser compatible video player☆53Apr 18, 2017Updated 8 years ago
- Focused Crawler for VT's CTRNet☆10May 13, 2013Updated 12 years ago
- Green SqlAlchemy extensions for pulsar☆11Nov 24, 2017Updated 8 years ago
- My dotfiles☆12Feb 9, 2026Updated 3 weeks ago
- Bicycle Incident reporting☆13Jul 22, 2022Updated 3 years ago
- Digitization information system built on top of the Fedora repository☆16Jan 15, 2019Updated 7 years ago
- An open-source news aggregator☆15Sep 9, 2016Updated 9 years ago
- Software for unsupervised word segmentation and language model learning using lattices☆45Aug 17, 2016Updated 9 years ago
- Latent Dirichlet allocation (LDA) for datamicroscopes☆41Oct 16, 2015Updated 10 years ago
- sparql-stream sensor queries☆16Sep 28, 2016Updated 9 years ago
- Simple, open source utility to convert CSV/TSV files to RDF☆14Aug 6, 2014Updated 11 years ago
- Scala port of the word2vec toolkit.☆11Aug 15, 2016Updated 9 years ago
- Spring integration with Stardog RDF database☆18Jan 27, 2025Updated last year
- RDFSpace constructs a vector space from any RDF dataset which can be used for computing similarities between resources in that dataset.☆41Nov 8, 2013Updated 12 years ago
- Stream torrents to VLC using Peerflix and torrent using your terminal☆10Feb 15, 2018Updated 8 years ago
- Loopback web application for administration of Datawake networks☆10May 2, 2017Updated 8 years ago
- Implementation of the MMDAgent for use as a live receptionist in Carnegie Mellon's School of Computer Science.☆15Apr 11, 2013Updated 12 years ago
- A tool to get pretty girls images in your command line☆28Aug 20, 2014Updated 11 years ago
- Taws - A personal and private web search engine☆24Feb 20, 2015Updated 11 years ago
- Code for Max-Margin Deep Generative Models☆12Jan 1, 2015Updated 11 years ago
- Stream Processing ToolKit☆18Aug 14, 2015Updated 10 years ago
- Search over RDF schemas and OWL ontologies☆11Sep 28, 2013Updated 12 years ago