microsoft / ARXGEN
Scripts to parse arXiv documents for NLP tasks
☆17 · Updated last year
Related projects
Alternatives and complementary repositories for ARXGEN
- Hugging Face and Pyserini interoperability ☆19 · Updated last year
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆24 · Updated last year
- Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021) ☆10 · Updated 2 years ago
- CyBERTron-LM is a project that collects pre-trained Transformer-based models. ☆12 · Updated last year
- Code for the EMNLP 2019 paper "A Split-and-Recombine Approach for Follow-up Query Analysis" ☆17 · Updated last year
- Towards Semantics-Enhanced Pre-Training: Can Lexicon Definitions Help Learning Sentence Meanings? (AAAI 2021) ☆9 · Updated 3 years ago
- Generative Retrieval Transformer ☆29 · Updated last year
- PyTorch library for synthesizing programs from natural language ☆18 · Updated 3 months ago
- Code for the paper "Accessing higher dimensions for unsupervised word translation" ☆21 · Updated last year
- A file utility for accessing both local and remote files through a unified interface. ☆35 · Updated 3 months ago
- Search-based-Neural-Structured-Learning-for-Sequential-Question-Answering ☆32 · Updated last year
- The code repository associated with the NeurIPS 2020 paper "Towards Neural Programming Interfaces" ☆13 · Updated 2 years ago
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols ☆14 · Updated 3 years ago
- PROSE Public Benchmark Suite ☆24 · Updated last month
- Multilingual Compositional Wikidata Questions (MCWQ) ☆18 · Updated last year
- Scripts supporting the development and serving of the Roots Search Tool - https://hf.co/spaces/bigscience-data/roots-search ☆10 · Updated last year
- NUANCED is a user-centric conversational recommendation dataset that contains 5.1k annotated dialogues and 26k high-quality user turns. ☆18 · Updated 3 years ago
- >>PhysWikiQuiz<< - a Physics Question Generation and Interrogation System ☆11 · Updated last year
- BANG is a new pretraining model to Bridge the gap between Autoregressive (AR) and Non-autoregressive (NAR) Generation. AR and NAR generat… ☆28 · Updated 2 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Updated last year
- This repo contains data and code for the paper "Reasoning over Public and Private Data in Retrieval-Based Systems." ☆46 · Updated 3 months ago
- A generic library for crafting adversarial NLP examples - WIP ☆40 · Updated 6 years ago
- Simplifying parsing of large jsonline files in NLP Workflows ☆12 · Updated 2 years ago
- NLG Best Practices for Data-Efficient Modeling: How to Train Production-Ready Models with Little Data ☆11 · Updated 3 years ago
- Data and code accompanying the paper "Intent Detection with WikiHow" ☆10 · Updated 3 years ago