LukasKriesch / CommonCrawlNewsDataSet
This repository contains code to download, extract, filter, and geocode news articles from the Common Crawl News Dataset.
☆18Updated 5 months ago
Alternatives and similar repositories for CommonCrawlNewsDataSet
Users who are interested in CommonCrawlNewsDataSet are comparing it to the repositories listed below.
- Samples of Entando applications☆12Updated 3 years ago
- ☆10Updated 9 years ago
- Archiving and transforming official Italian General Election text-only polls into machine readable data using Large Language Models☆16Updated this week
- RDF Community Discussions. Ask anything here!☆13Updated last year
- TellMeFirst is a tool for classifying and enriching textual documents via Linked Open Data.☆25Updated 3 years ago
- The 2nd consultation on eForms, the update to the EU's procurement standard forms. Scroll down for more information.☆20Updated 5 years ago
- Lehigh University Benchmark (LUBM).☆10Updated 5 years ago
- The Knowledge Graph of the Knowledge Graph Conference☆14Updated 2 years ago
- A repository to work on the transmodel ontology that provides support to the NeTEx model☆11Updated 4 years ago
- Linked SDMX☆17Updated 11 years ago
- A high-throughput ontology-based pipeline for data integration☆14Updated 2 years ago
- SPARQL-LD: A SPARQL Extension for Fetching and Querying Linked Data☆18Updated 2 years ago
- A systematic Benchmarking on the performance of Spark-SQL for processing Vast RDF datasets☆14Updated 3 years ago
- [0.9.9 Released] A high performance non-SPARQL based RDF data cube validator☆16Updated 9 years ago
- Reference XSLT-based implementation of GeoDCAT-AP☆17Updated last week
- RDF4J Documentation☆13Updated 5 years ago
- EduCOR: An Educational and Career-Oriented Recommendation Ontology☆12Updated 4 years ago
- Automatically exported from code.google.com/p/publishing-statistical-data☆13Updated 11 months ago
- Convert RDF data to relational databases☆18Updated 7 years ago
- A Java-based SPARQL query generator☆12Updated last year
- OWL API profile checker☆17Updated 8 years ago
- The Open Data Standards Directory is an initiative to provide an inventory of information regarding open data standards. This site is opera…☆28Updated 3 months ago
- ☆40Updated 7 years ago
- Loading OpenSanctions into Neo4J and Linkurious☆30Updated 10 months ago
- Python based Wikidata framework for easy dataframe extraction☆45Updated last year
- iServe is what we refer to as service warehouse which unifies service publication, analysis, and discovery through the use of lightweigh…☆24Updated 9 years ago
- Homebase of the IPTC EXTRA project about rule-based text categorization☆13Updated 8 years ago
- The Toxic Comment Classification project is an application that uses deep learning to identify toxic comments as toxic, severe toxic, obs…☆16Updated 2 years ago
- A step-by-step tutorial for publishing data and an ontology as Linked Data on your machine.☆14Updated 2 years ago
- Another RDF to diagram tool☆17Updated 4 years ago