aws-neuron / aws-neuron-reference-for-megatron-lm
☆14 Updated last year
Related projects:
- ☆16 Updated 5 years ago
- Minimum Description Length probing for neural network representations ☆15 Updated 11 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆33 Updated last year
- Ranking of fine-tuned HF models as base models. ☆35 Updated last year
- ☆18 Updated last year
- ☆16 Updated 10 months ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆31 Updated 3 months ago
- ☆13 Updated 3 years ago
- ☆12 Updated 2 years ago
- My explorations into editing the knowledge and memories of an attention network ☆34 Updated last year
- Helper scripts and notes that were used while porting various nlp models ☆44 Updated 2 years ago
- Official code release for the paper Coder Reviewer Reranking for Code Generation. ☆41 Updated last year
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 Updated last year
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆30 Updated 3 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆42 Updated 10 months ago
- ☆46 Updated 2 years ago
- ☆13 Updated 5 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 Updated last year
- Embedding Recycling for Language models ☆38 Updated last year
- Transformers at any scale ☆39 Updated 8 months ago
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols ☆14 Updated 3 years ago
- ☆20 Updated 3 years ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆24 Updated last year
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval ☆14 Updated 8 months ago
- Critique-out-Loud Reward Models ☆17 Updated 2 weeks ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆26 Updated last year
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆12 Updated last year
- Implementation of N-Grammer in Flax ☆16 Updated last year
- Adding new tasks to T0 without catastrophic forgetting ☆30 Updated last year
- ☆27 Updated last year