osdg-ai / osdg-data
The OSDG Community Dataset (OSDG-CD) is a public dataset of thousands of text excerpts, validated by OSDG Community Platform (OSDG-CP) citizen scientists with respect to the Sustainable Development Goals (SDGs). The dataset is updated every quarter and published on Zenodo.
☆36Updated last year
Alternatives and similar repositories for osdg-data
Users who are interested in osdg-data are comparing it to the libraries listed below
- OSDG is an open-source tool that maps and connects activities to the UN Sustainable Development Goals (SDGs) by identifying SDG-relevant …☆43Updated 2 years ago
- A Python client for the GDELT 2.0 Doc API☆148Updated 4 months ago
- ☆16Updated 4 years ago
- Text analysis with networks.☆288Updated 5 months ago
- Fine-tuning a Hugging Face BERT model for the United Nations Named Entity Recognition task.☆34Updated 4 years ago
- A Flexible Deep Learning Approach to Fuzzy String Matching☆147Updated 10 months ago
- A very simple library for exploiting graph-of-words in NLP☆12Updated 4 years ago
- Code for the Master Thesis "Enhancing the Microsoft Academic Knowledge Graph"☆14Updated 4 years ago
- Code for the CUP Elements on text analysis in Python for social scientists☆137Updated 2 years ago
- Full text geoparsing/toponym resolution with event geolocation☆78Updated last week
- A deep learning system for demographic inference (gender, age, and individual/person) that was trained on massive Twitter dataset using p…☆151Updated 2 years ago
- Interpretable data visualizations for understanding how texts differ at the word level☆280Updated 6 months ago
- Text and statistics utilities from Pew Research Center☆86Updated 3 years ago
- This package consists of functionalities for dynamic topic modelling and its visualization☆26Updated 5 years ago
- Data and code accompanying the Nature paper "Quantifying social organization and political polarization in online platforms"☆64Updated 3 years ago
- Python-based framework to retrieve Global Database of Events, Language, and Tone (GDELT) version 1.0 and version 2.0 data.☆231Updated last year
- Pretrained BERT model for analysing COVID-19 Twitter data☆184Updated 2 years ago
- Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure*☆81Updated last year
- A Python wrapper around the topic modeling functions of MALLET.☆103Updated 10 months ago
- Using the Gmail API to topic model my recommended Medium reads☆24Updated 3 years ago
- How are words loaded with meaning? Repository accompanying research by Alina Arseniev-Koehler and Jacob G. Foster, titled "Machine learn…☆41Updated 2 years ago
- TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/und…☆357Updated 5 months ago
- The FBAdLibrarian is a simple tool that can pull ad data and collect images offered by Facebook’s Ad Library API.☆16Updated 2 years ago
- Wellcome tool to parse references scraped from policy documents using machine learning☆25Updated 4 years ago
- The project proposes a framework for applying topic models to a text corpus and then assigning topic labels to the generated topics.☆35Updated last year
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata☆164Updated 2 years ago
- Analysis and experiments on the UN General Debate corpus☆36Updated 6 years ago
- Tutorial for using twarc, with steps for installing software.☆25Updated 7 years ago
- Nesta's Skills Extractor Library☆141Updated 2 months ago
- Fuzzy matches and merging of datasets in pandas using csvmatch☆74Updated 5 years ago