fcibecchini / smart-crawler
A smart distributed crawler that infers navigation models of structured websites, used to cluster pages based on their structure and extract data from them.
☆8 · Updated 3 years ago
Alternatives and similar repositories for smart-crawler:
Users who are interested in smart-crawler are comparing it to the libraries listed below:
- Python and Scala APIs for enhanced Spark analytics ☆12 · Updated 7 years ago
- Text similarity based on Word2Vec vectors. ☆11 · Updated 7 years ago
- A Java framework to build semantics-aware autoencoder neural network from a knowledge-graph. ☆13 · Updated 7 years ago
- Code and Data Samples for Big Data Warehousing. ☆10 · Updated 9 years ago
- Movielens collaborative filtering with Solr streaming expression ☆11 · Updated 8 years ago
- Short Text Similarity as described in https://dl.acm.org/citation.cfm?id=2806475 ☆16 · Updated 5 years ago
- KnowledgeStore ☆20 · Updated 6 years ago
- Extract statistics from Wikipedia Dump files. ☆26 · Updated 3 years ago
- phData Pulse application log aggregation and monitoring ☆13 · Updated 4 years ago
- Java library for Concrete, a data serialization format for NLP ☆6 · Updated 5 years ago
- Collects multimedia content shared through social networks. ☆19 · Updated 9 years ago
- Query Spark in real time and visualise the results as a graph. ☆24 · Updated 7 years ago
- D3 and Play based visualization for entity-relation graphs, especially for NLP and information extraction ☆29 · Updated 9 years ago
- A set of tools for performing Labeled Latent Dirichlet Allocation on textual datasets, with an emphasis on Twitter profiles. Contains too… ☆42 · Updated 3 years ago
- Traptor -- A distributed Twitter feed ☆26 · Updated 2 years ago
- Temporal_Graph_library ☆25 · Updated 5 years ago
- Neural Elastic Inference and Search ☆19 · Updated 5 years ago
- JavaScript library to talk to multiple OLAP backends from multiple frontends ☆18 · Updated 11 years ago
- iCQA - Intelligent Community Question Answering Framework ☆32 · Updated 8 years ago
- A subgroup discovery tool that can use ontological domain knowledge (RDF graphs) in the learning process. Subgroup descriptions contain t… ☆12 · Updated 7 years ago
- Implicit relation extractor using a natural language model. ☆25 · Updated 6 years ago
- Big GeoSpatial Data Points Visualization Tool ☆19 · Updated 8 years ago
- Code examples for Google Natural Language API. ☆13 · Updated 5 years ago
- Simple FieldCache based query introspection Solr Search Component - solves the 'red sofa' problem ☆12 · Updated 3 years ago
- Deep learning certificate part 1 ☆10 · Updated 2 years ago
- The first Open Source document analysis platform ☆65 · Updated 3 years ago
- Provides the implementation of a topic detection framework developed for the MULTISENSOR project. ☆9 · Updated 8 years ago