aws-neuron / aws-neuron-reference-for-megatron-lm
☆14 Updated last year
Alternatives and similar repositories for aws-neuron-reference-for-megatron-lm
Users who are interested in aws-neuron-reference-for-megatron-lm are comparing it to the libraries listed below.
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 Updated last year
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval ☆14 Updated last year
- ☆16 Updated 5 years ago
- Resources accompanying the "Zero-Shot Recommendation as Language Modeling" paper (ECIR2022) ☆13 Updated 2 years ago
- interactive explorer for language models ☆9 Updated 6 years ago
- ☆16 Updated last year
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆24 Updated last year
- ☆9 Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 Updated last year
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆14 Updated 2 years ago
- Generative Retrieval Transformer ☆28 Updated last year
- Transformers at any scale ☆41 Updated last year
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆29 Updated last week
- ☆12 Updated 3 years ago
- Minimum Description Length probing for neural network representations ☆19 Updated 4 months ago
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling ☆36 Updated last year
- Large-scale query-focused multi-document Summarization dataset ☆10 Updated 3 years ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- Some microbenchmarks and design docs before commencement ☆12 Updated 4 years ago
- Helper scripts and notes that were used while porting various nlp models ☆46 Updated 3 years ago
- RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge ☆15 Updated 3 years ago
- Neural Unification for Logic Reasoning over Language ☆22 Updated 3 years ago
- ☆17 Updated last year
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 Updated last year
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers" ☆19 Updated 2 months ago
- Learning to Model Editing Processes ☆26 Updated 3 years ago
- A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering ☆16 Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆48 Updated last year
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols ☆16 Updated 3 years ago
- ☆28 Updated 2 years ago