osdg-ai / osdg-data
The OSDG Community Dataset (OSDG-CD) is a public dataset of thousands of text excerpts, validated by OSDG Community Platform (OSDG-CP) citizen scientists with respect to the Sustainable Development Goals (SDGs). The dataset is updated every quarter and published on Zenodo.
☆30 · Updated last year
Alternatives and similar repositories for osdg-data:
Users who are interested in osdg-data are comparing it to the repositories listed below.
- OSDG is an open-source tool that maps and connects activities to the UN Sustainable Development Goals (SDGs) by identifying SDG-relevant … ☆38 · Updated 2 years ago
- Code for the Master Thesis "Enhancing the Microsoft Academic Knowledge Graph" ☆14 · Updated 4 years ago
- Using the Gmail API to topic model my recommended Medium reads ☆24 · Updated 3 years ago
- Full text geoparsing/toponym resolution with event geolocation ☆74 · Updated last month
- ☆16 · Updated 4 years ago
- Semantic Scholar's Author Disambiguation Algorithm & Evaluation Suite ☆90 · Updated last year
- A tool for Semantic Scaling of Political Text (branch of Topfish, a suite of tools for Political Text Analysis) ☆27 · Updated last year
- A light-weight wrapper for the Datawrapper API. ☆63 · Updated 8 months ago
- ✨ Awesome - A curated list of amazing Topic Models (implementations, libraries, and resources) ☆93 · Updated 2 years ago
- Python package for text mining of time-series data ☆71 · Updated 3 months ago
- Blazing fast topic modelling for short texts. ☆31 · Updated 2 months ago
- A Python package to enrich Twitter Data ☆75 · Updated last year
- CSV inspection ☆46 · Updated 3 weeks ago
- Fine-tuning a Hugging Face BERT model for the United Nations Named Entity Recognition task. ☆33 · Updated 3 years ago
- A text processing pipeline for turning unstructured text data into hierarchical datasets ☆14 · Updated 4 years ago
- A tool to assign Sustainable Development Goals to a scientific abstract ☆16 · Updated 4 years ago
- Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringi… ☆35 · Updated 3 years ago
- Helpers for our open data ☆7 · Updated 4 months ago
- Extract networks of entities from journalistic reporting ☆48 · Updated last year
- ☆54 · Updated last year
- DashMap is an open source web platform that gathers, analyses and visualises urban data. ☆45 · Updated 2 years ago
- Repository hosting the large language model EconBERTa and the annotated dataset EconIE ☆13 · Updated 6 months ago
- How are words loaded with meaning? Repository accompanying research by Alina Arseniev-Koehler and Jacob G. Foster, titled "Machine learn… ☆41 · Updated last year
- A novel R package that can identify and visualize 17 Sustainable Development Goals and associated 169 Targets in text ☆15 · Updated 6 months ago
- A very simple library for exploiting graph-of-words in NLP ☆12 · Updated 3 years ago
- A list of GDELT themes that taken together broadly represent "issues" and media source lists, a way to split GDELT sources into more conc… ☆20 · Updated 5 years ago
- Tutorial for using twarc, with steps for installing software. ☆25 · Updated 7 years ago
- Text analysis with networks. ☆285 · Updated this week
- Fast, flexible name matching for large datasets ☆71 · Updated last year
- Easy PDF to text to spaCy text extraction in Python. ☆39 · Updated 5 months ago