McGill-NLP/medal

Large medical text dataset curated for abbreviation disambiguation, designed for natural language understanding pre-training in the medical domain

☆284

Alternatives and similar repositories for medal

Users who are interested in medal compare it to the repositories listed below.

jianlins / PyRuSH
init
☆13 · Dec 4, 2024 · Updated last year
chanzuckerberg / MedMentions
A corpus of Biomedical papers annotated with mentions of UMLS entities.
☆344 · Nov 9, 2021 · Updated 4 years ago
abachaa / Existing-Medical-QA-Datasets
Multimodal Question Answering in the Medical Domain: A summary of Existing Datasets and Systems
☆315 · Oct 17, 2023 · Updated 2 years ago
ncbi-nlp / bluebert
BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-III).
☆588 · Mar 25, 2023 · Updated 2 years ago
luoyuanlab / Clinical-Longformer
☆63 · Jul 4, 2023 · Updated 2 years ago
hibaahsan / MIMIC-SBDH
Dataset containing 7,025 discharge summary notes from the MIMIC III dataset annotated for 7 SBDHs
☆19 · Jun 18, 2022 · Updated 3 years ago
CogStack / MedCAT
Medical Concept Annotation Tool
☆523 · Jul 25, 2025 · Updated 7 months ago
bigscience-workshop / biomedical
Tools for curating biomedical training data for large-scale language modeling
☆493 · Dec 9, 2024 · Updated last year
uf-hobi-informatics-lab / ClinicalTransformerNER
A library for named entity recognition developed by the UF HOBI NLP lab, featuring SOTA algorithms
☆156 · Sep 13, 2023 · Updated 2 years ago
medspacy / sectionizer
A rule-based Python module for splitting documents into sections.
☆12 · Nov 14, 2020 · Updated 5 years ago
panushri25 / emrQA
Code for the emrQA question answering dataset
☆153 · Feb 9, 2022 · Updated 4 years ago
NachusS / Snomed2Vec
New approach to use Snomed-CT Concept using Word Embedding with Word2vec
☆21 · Feb 27, 2019 · Updated 7 years ago
kormilitzin / med7
☆221 · Dec 11, 2024 · Updated last year
ncbi-nlp / BLUE_Benchmark
BLUE benchmark consists of five different biomedicine text-mining tasks with ten corpora.
☆296 · Jan 12, 2022 · Updated 4 years ago
medspacy / medspacy
Library for clinical NLP with spaCy.
☆634 · Aug 4, 2025 · Updated 7 months ago
neulab / InterpretEval
Interpretable Evaluation for (Almost) All NLP Tasks
☆195 · Sep 22, 2025 · Updated 5 months ago
gmichalo / UmlsBERT
☆101 · Feb 25, 2022 · Updated 4 years ago
facebookresearch / bio-lm
We evaluate many models used for biomedical and clinical NLP tasks, and train new models that perform much better.
☆163 · Jul 29, 2021 · Updated 4 years ago
seonhee99 / EHR-SeqSQL
Official repository of "EHR-SeqSQL : A Sequential Text-to-SQL Dataset For Interactively Exploring Electronic Health Records" (ACL 2024 Fi…
☆17 · Jul 5, 2024 · Updated last year
JohnGiorgi / DeCLUTR
The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to o…
☆378 · Apr 21, 2023 · Updated 2 years ago
simonlevine / clinical-longformer
☆38 · Mar 27, 2022 · Updated 3 years ago
nlpie-research / Compact-Biomedical-Transformers
This repository contains the code used for distillation and fine-tuning of compact biomedical transformers that have been introduced in t…
☆19 · Mar 26, 2024 · Updated last year
babylonhealth / EHR-Rel
Biomedical concept relatedness benchmark sampled from electronic health records
☆11 · Jul 14, 2022 · Updated 3 years ago
MeteSertkan / ranger
Ranger helps you see the forest among the trees - Ranger is an effect-size meta-analysis library creating beautiful forest plots!
☆11 · Jun 12, 2023 · Updated 2 years ago
Georgetown-IR-Lab / QuickUMLS
System for Medical Concept Extraction and Linking
☆435 · Aug 12, 2024 · Updated last year
xiangyue9607 / CliniRC
Code for the paper "Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset" (ACL 2020)
☆17 · May 9, 2020 · Updated 5 years ago
abhilash1910 / ClusterTransformer
Topic clustering library built on Transformer embeddings and cosine similarity metrics. Compatible with all BERT base transformers from hu…
☆44 · Jun 11, 2021 · Updated 4 years ago
lisavirginia / clinical-abbreviations
☆67 · Feb 27, 2021 · Updated 5 years ago
asahi417 / tner
Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer…
☆396 · May 11, 2023 · Updated 2 years ago
nedap / deidentify
A Python library to de-identify medical records with state-of-the-art NLP methods.
☆142 · Nov 17, 2025 · Updated 3 months ago
timoschick / bertram
This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations".
☆64 · Aug 13, 2020 · Updated 5 years ago
clinicalml / omop-learn
Python package for machine learning for healthcare using an OMOP common data model
☆111 · Jun 17, 2023 · Updated 2 years ago
OHNLP / MedTagger
MedTagger is a lightweight clinical NLP system built upon Apache UIMA.
☆71 · May 5, 2025 · Updated 10 months ago
uf-hobi-informatics-lab / NLPreprocessing
A comprehensive NLP preprocessing package for clinical notes sentence boundary detection, tokenization
☆32 · May 22, 2024 · Updated last year
allenai / BEEP
Code repository for BEEP (Biomedical Evidence Enhanced Predictions) clinical outcome prediction system
☆26 · Nov 8, 2023 · Updated 2 years ago
dmis-lab / BioLAMA
EMNLP'2021: Can Language Models be Biomedical Knowledge Bases?
☆57 · Mar 9, 2023 · Updated 3 years ago
abachaa / MedQuAD
Medical Question Answering Dataset of 47,457 QA pairs created from 12 NIH websites
☆441 · Oct 17, 2023 · Updated 2 years ago
helboukkouri / character-bert
Main repository for "CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters"
☆199 · Oct 3, 2023 · Updated 2 years ago
dermatologist / omopfhirmap
OMOP <-> FHIR mapper
☆11 · Mar 6, 2023 · Updated 3 years ago