junhua / IPOD
A Corpus of 475,000 Industrial Occupations
☆66 · Updated 4 years ago
Alternatives and similar repositories for IPOD:
Users who are interested in IPOD are comparing it to the libraries listed below:
- The dataset used to evaluate JobBERT on the task of job title normalization. ☆26 · Updated 2 years ago
- Nesta's Skills Extractor Library ☆129 · Updated 5 months ago
- A Python library aimed at dissecting and augmenting NER training data. ☆58 · Updated last year
- Explainable Zero-Shot Topic Extraction ☆62 · Updated 7 months ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ☆80 · Updated 2 years ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ☆245 · Updated last year
- Code and Dataset for the Bhola et al. (2020) Retrieving Skills from Job Descriptions: A Language Model Based Extreme Multi-label Classifi… ☆53 · Updated 3 years ago
- Fine-tuning a Hugging Face BERT model for the United Nations Named Entity Recognition task. ☆33 · Updated 3 years ago
- SKILLSPAN: Competences as Spans for Skill Extraction from Job Postings ☆60 · Updated last month
- Entity Disambiguation as text extraction (ACL 2022) ☆181 · Updated 2 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆153 · Updated 10 months ago
- Creating class-based TF-IDF matrices ☆83 · Updated 2 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 · Updated last year
- Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020 ☆62 · Updated 11 months ago
- A spaCy wrapper for DBpedia Spotlight ☆109 · Updated 2 years ago
- A library to synthesize text datasets using Large Language Models (LLM) ☆151 · Updated 2 years ago
- Information extraction from English and German texts based on predicate logic ☆135 · Updated last year
- Information extraction pipeline containing coreference resolution, named entity linking, and relationship extraction ☆81 · Updated 4 years ago
- Vespa application making an index of the CORD-19 dataset. ☆39 · Updated 2 months ago
- [LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweeban… ☆104 · Updated last year
- Visualise, evaluate, and manage annotated data ☆33 · Updated 2 years ago
- Sentence transformers models for SpaCy ☆107 · Updated 2 years ago
- This repository contains materials for the SIGIR 2022 tutorial on opinion summarization. ☆34 · Updated 2 years ago
- Examples for aligning, padding and batching sequence labeling data (NER) for use with pre-trained transformer models ☆65 · Updated 2 years ago
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy. ☆106 · Updated 11 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆108 · Updated 10 months ago
- Few-shot Named Entity Recognition ☆123 · Updated 3 years ago
- Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs). ☆70 · Updated 7 months ago
- Running Prodigy for a team of annotators ☆53 · Updated 4 years ago
- ☆61 · Updated 4 years ago