mapio / py-web-graph
A simple package for using WebGraph data in Python (via the Jython interpreter).
☆17 Updated 4 years ago
Alternatives and similar repositories for py-web-graph:
Users who are interested in py-web-graph are comparing it to the libraries listed below.
- A GPU-acceleration of the graph database Neo4j ☆34 Updated 7 years ago
- Webgraph++ code (http://cnets.indiana.edu/groups/nan/webgraph/) ☆30 Updated 5 months ago
- A Generalized Data Cleaning System ☆49 Updated 8 years ago
- Extract statistics from Wikipedia Dump files. ☆26 Updated 3 years ago
- ☆42 Updated last year
- A Python wrapper over the GraphGen system ☆37 Updated 7 years ago
- Linking Entities in CommonCrawl Dataset onto Wikipedia Concepts ☆59 Updated 12 years ago
- For interacting with nutch via Python ☆24 Updated this week
- ☆29 Updated 7 years ago
- KnowledgeStore ☆20 Updated 6 years ago
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders… ☆46 Updated 3 years ago
- Weighted MinHash implementation on CUDA (multi-gpu). ☆116 Updated last year
- 🌐 Netbase : Semantic Graph Database & Wikidata Server ☆8 Updated last year
- A project for clustering text streams using locality-sensitive hashing (LSH) in Python ☆27 Updated 13 years ago
- A distributed system for mining common crawl using SQS, AWS-EC2 and S3 ☆16 Updated 10 years ago
- Deployment of pywb as a CommonCrawl Index Server ☆21 Updated 7 years ago
- Algorithms for "schema matching" ☆25 Updated 8 years ago
- Scalable Graph Mining ☆61 Updated 2 years ago
- The STINGER in-memory graph store and dynamic graph analysis platform. Millions to billions of vertices and edges at thousands to millio… ☆11 Updated 9 years ago
- Json Wikipedia, contains code to convert the Wikipedia xml dump into a json dump. Questions? https://gitter.im/idio-opensource/Lobby ☆17 Updated 2 years ago
- Graph Challenge ☆31 Updated 5 years ago
- CuSha is a CUDA-based vertex-centric graph processing framework that uses G-Shards and CW representations. ☆52 Updated 9 years ago
- Alchemist: an Apache Spark<->MPI interface ☆26 Updated 6 years ago
- Pipeline for distributed Natural Language Processing, made in Python ☆65 Updated 7 years ago
- Raw Wikipedia counts for entity linking ☆19 Updated 7 years ago
- A repository for the "Combining DBpedia and Topic Modeling" GSoC 2016 idea ☆13 Updated 8 years ago
- WebGraph is a framework for graph compression. ☆54 Updated 3 months ago
- Assessing Source Code Semantic Similarity with Unsupervised Learning ☆41 Updated 6 years ago
- GraphMineSuite (GMS): a benchmarking suite for graph mining algorithms such as graph pattern matching or graph learning ☆25 Updated 3 years ago