brianckeegan / Wikipedia
Crawling and analyzing data on Wikipedia
☆16Updated 11 months ago
Alternatives and similar repositories for Wikipedia:
Users who are interested in Wikipedia are comparing it to the repositories listed below
- Python API for KB data-services☆19Updated 5 years ago
- A simple Web crawler for stackshare.io using Scrapy.☆9Updated 5 years ago
- Google Refine extension for adding columns (extending data) from DBpedia☆39Updated 11 years ago
- Entity linker for the newspaper collection of the National Library of the Netherlands. Links named entity mentions to DBpedia description…☆11Updated 2 years ago
- Processing OpenCitations Data☆17Updated 7 years ago
- modification of bibliotools 2.2 from Sébastian Grauwin☆11Updated 5 years ago
- Uses NLP methods to parse and classify contracts from The City of New Orleans☆10Updated 9 years ago
- Citation Style Language utilities☆18Updated 3 years ago
- Text Thresher crowdsourced text annotator☆17Updated 7 years ago
- A browser extension providing Open Access bibliographical services☆14Updated 2 years ago
- A repository for the "Combining DBpedia and Topic Modeling" GSoC 2016 idea☆13Updated 8 years ago
- Topic Modeling Workflow in Python☆16Updated 2 years ago
- A collection of IPython/Jupyter notebooks☆16Updated 6 years ago
- A framework to allow the matching of string entities using customised sets of transformations and matchers, plus a tool to produce the ne…☆31Updated 7 years ago
- Plots various graphs for a series of plaintext files in a directory☆19Updated 8 years ago
- A PDFMiner wrapper to ease text extraction from PDF files.☆25Updated 11 years ago
- Bigram / trigram analysis of Wikipedia; mainly mutual information☆22Updated 12 years ago
- Force-Atlas 2 graph layout in NetworkX☆22Updated 10 years ago
- The OpenSextant Gazetteer is a collection of world-wide place name data☆12Updated 7 years ago
- Convert a corpus of PDFs to clean text files on a distributed architecture☆38Updated 11 months ago
- Code + Jupyter Notebooks for Visualizing Clusters of Clickbait Headlines Using Spark, Word2vec, and Plotly☆47Updated 4 years ago
- A system to generate SPARQL queries from natural language queries.☆30Updated this week
- A deep learning architecture for reference mining from literature in the arts and humanities.☆15Updated 5 years ago
- Scrapes citation statistics from Google Scholar☆61Updated last month
- Literate data analysis with IPython notebooks and Jekyll.☆92Updated 10 years ago
- Corpus Build OCR platform☆8Updated 2 years ago
- Web app for parsing citations☆40Updated 15 years ago
- Service for creating Twitter datasets for research and archiving.☆26Updated 2 years ago
- A PyData 2013 talk on straightforward, data-driven ways to handle natural language text in Python.☆50Updated 10 years ago