Download, parse, and filter PubMed data, data-ready for The-Pile
☆23Dec 16, 2021Updated 4 years ago
Alternatives and similar repositories for The-Pile-PubMed
Users that are interested in The-Pile-PubMed are comparing it to the libraries listed below.
- A script for collecting the PubMed Central dataset in a language modelling friendly format.☆26Feb 16, 2021Updated 5 years ago
- EMNLP 2020: Filtering before Iteratively Referring for Knowledge-Grounded Response Selection in Retrieval-Based Chatbots☆12Dec 15, 2020Updated 5 years ago
- ☆42May 23, 2023Updated 3 years ago
- This repository contains code for the *SEM 2023 paper “Generative Data Augmentation for Aspect Sentiment Quad Prediction”.☆10May 30, 2023Updated 3 years ago
- [ACL 2023] Code for ContraCLM: Contrastive Learning For Causal Language Model☆35Dec 20, 2023Updated 2 years ago
- ☆19Mar 6, 2023Updated 3 years ago
- Official Code for ACL 2023 paper: "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confid…☆24May 8, 2023Updated 3 years ago
- ☆11Oct 2, 2024Updated last year
- Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration"☆44Aug 20, 2024Updated last year
- ☆26Aug 18, 2023Updated 2 years ago
- Pipeline for analyzing rare mutations in metagenome-assembled genomes☆10Apr 4, 2025Updated last year
- The implementation for "Open Relation Modeling: Learning to Define Relations between Entities" (Findings of ACL '22)☆12Feb 28, 2022Updated 4 years ago
- The evaluation code for the paper "MoreHopQA: More Than Multi-hop Reasoning"☆15Jun 21, 2024Updated last year
- ☆25Nov 14, 2022Updated 3 years ago
- 🤡 An up-to-date & curated list of awesome KBQA papers, methods & resources.☆10Jul 14, 2022Updated 3 years ago
- Code for ACL 2022 long paper: Can Prompt Probe Pretrained Language Models? Understanding the Invisible Risks from a Causal View☆10May 17, 2022Updated 4 years ago
- ☆12Feb 26, 2020Updated 6 years ago
- PubMedQA: A Dataset for Biomedical Research Question Answering☆426Apr 18, 2023Updated 3 years ago
- Minimum viable code for the Decodable Information Bottleneck paper. Pytorch Implementation.☆12Oct 20, 2020Updated 5 years ago
- Companion repository to "Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models"☆14May 31, 2023Updated 3 years ago
- [COLM '25] Single-Pass Document Scanning for Question Answering☆14Aug 20, 2025Updated 9 months ago
- ☆14Nov 23, 2020Updated 5 years ago
- SciRepEval benchmark training and evaluation scripts☆91May 5, 2026Updated last month
- Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888☆37Jun 10, 2024Updated 2 years ago
- ☆22Mar 19, 2021Updated 5 years ago
- ☆10Oct 2, 2024Updated last year
- ☆13Jan 20, 2023Updated 3 years ago
- Code repository for the c-BTM paper☆109Sep 26, 2023Updated 2 years ago
- Official code for the ACL 2024 paper: Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New …☆61May 22, 2024Updated 2 years ago
- CATH: high-throughput protein structure/function annotations☆12Dec 17, 2019Updated 6 years ago
- a precise pangenome browser combining linear and graph-based pan-genome☆13Jul 16, 2024Updated last year
- Utility functions for weights and biases (wandb).☆11Sep 17, 2024Updated last year
- BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings)☆126Sep 14, 2024Updated last year
- Entity linker for the newspaper collection of the National Library of the Netherlands. Links named entity mentions to DBpedia description…☆11Dec 8, 2022Updated 3 years ago
- Recent application of graph neural network in drug discovery☆14Mar 19, 2020Updated 6 years ago
- An OpenAI API proxy service based on Cloudflare Workers, supporting multi-channel management, token management, and usage statistics☆28Apr 26, 2026Updated last month
- Robust individual and aggregate checksums for nucleotide sequences☆17Mar 3, 2026Updated 3 months ago
- Repository for ACL 2022 paper Mix and Match: Learning-free Controllable Text Generation using Energy Language Models☆46Mar 13, 2022Updated 4 years ago
- Official implementation for "Pruning Randomly Initialized Neural Networks with Iterative Randomization"☆10Oct 5, 2021Updated 4 years ago