microsoft / ARXGEN
Scripts to parse arxiv documents for NLP tasks
☆17Updated last year
Alternatives and similar repositories for ARXGEN:
Users that are interested in ARXGEN are comparing it to the libraries listed below
- ☆14Updated 7 months ago
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval☆14Updated last year
- Code, datasets and results of the ChatGPT evaluation presented in paper "ChatGPT: Jack of all trades, master of none"☆29Updated 2 years ago
- A simple semantic search engine for scientific papers.☆28Updated last year
- ☆16Updated last year
- Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021)☆10Updated 3 years ago
- Hugging Face and Pyserini interoperability☆20Updated last year
- The Implementation for the Paper "Time-Stamped Language Model: Teaching Language Models to Understand The Flow of Events"☆11Updated 3 years ago
- Code for running the experiments in Deep Subjecthood: Higher Order Grammatical Features in Multilingual BERT☆17Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆18Updated 2 years ago
- Documentation effort for the BookCorpus dataset☆34Updated 3 years ago
- A curated list of papers exploring the limits of deep learning for NLP☆24Updated 7 years ago
- NLG Best Practices for Data-Efficient Modeling: How to Train Production-Ready Models with Little Data☆10Updated 3 years ago
- ECIR'21: Simplified TinyBERT: Knowledge Distillation for Document Retrieval☆16Updated 3 years ago
- Large-scale query-focused multi-document Summarization dataset☆10Updated 3 years ago
- Code for our ACL '20 paper "Representation Engineering with Natural Language Explanations"☆29Updated 4 years ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi…☆33Updated 10 months ago
- Official demo repository for our ACL 2019 long paper "Generating Question-Answer Hierarchies".☆19Updated 4 years ago
- Code for the paper "Accessing higher dimensions for unsupervised word translation"☆21Updated last year
- PyTorch library for synthesizing programs from natural language☆18Updated 8 months ago
- Generative Retrieval Transformer☆28Updated last year
- Repository for Skill Set Optimization☆12Updated 8 months ago
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch☆30Updated this week
- This repo contains datasets and code for Assessing Phrasal Representation and Composition in Transformers, by Lang Yu and Allyson Ettinge…☆11Updated 3 years ago
- Mapping natural language commands to web elements☆37Updated 2 years ago
- ☆11Updated 2 years ago
- ☆26Updated 5 years ago
- Unifew: Unified Fewshot Learning Model☆18Updated 3 years ago
- A file utility for accessing both local and remote files through a unified interface.☆40Updated this week
- URL downloader supporting checkpointing and continuous checksumming.☆19Updated last year