hunterlang / weaksup-subset-selection
Subset selection / data pruning for weak supervision
☆14 Updated last year
Related projects:
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding. ☆25 Updated last month
- LitQA Eval: A difficult set of scientific questions that require context of full-text research papers to answer ☆31 Updated 2 months ago
- Biomedical Entity Linking Benchmark ☆9 Updated 3 months ago
- My heuristic script for sentence tokenization of mimic notes ☆8 Updated 6 years ago
- Multimodal Transformers for biomedical text and Knowledge Graph data ☆32 Updated last year
- BioM-Transformers: Building Large Biomedical Language Models with BERT, ALBERT and ELECTRA ☆33 Updated 7 months ago
- Code for co-training large language models (e.g. T0) with smaller ones (e.g. BERT) to boost few-shot performance ☆17 Updated last year
- Bio relation extraction labeled dataset ☆41 Updated 2 years ago
- Python library for converting between BioNLP formats ☆20 Updated last year
- CascadER: Cross-Modal Cascading for Knowledge Graph Link Prediction (arXiv 22) ☆13 Updated 2 years ago
- Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible. ☆37 Updated 5 months ago
- Platform enabling Rapid Annotation for Clinical Entity Recognition ☆49 Updated 2 years ago
- A Python Natural Language Processing Toolkit for Medical Text Generation ☆66 Updated this week
- Code repository for BEEP (Biomedical Evidence Enhanced Predictions) clinical outcome prediction system ☆24 Updated 10 months ago
- EMNLP'2021: Can Language Models be Biomedical Knowledge Bases? ☆54 Updated last year
- Code for the paper SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts (AKBC 2021). https://openreview.net/forum?id=OF… ☆25 Updated 2 years ago
- My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other h… ☆51 Updated last year
- Embedding Recycling for Language models ☆38 Updated last year
- For Med-Gemini, we relabeled the MedQA benchmark; this repo includes the annotations and analysis code. ☆24 Updated 3 months ago
- BioELECTRA ☆51 Updated 2 years ago
- Biomedical Question Answering Datasets. ☆71 Updated last year
- Official Code Repo for the Paper: "How does This Interaction Affect Me? Interpretable Attribution for Feature Interactions", In NeurIPS 2… ☆36 Updated last year
- Biomedical Data-to-Text Generation via Fine-Tuning Transformers ☆29 Updated 2 years ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆30 Updated 3 months ago
- Bioformer: an efficient BERT model for biomedical text mining ☆53 Updated last year
- Code for CTO: A Large Clinical Trial Outcome and QA Dataset ☆14 Updated this week