microsoft / deep-language-networks
We view large language models as stochastic language layers in a network, where the learnable parameters are the natural-language prompts at each layer. We stack two such layers, feeding the output of one layer into the next, and call the stacked architecture a Deep Language Network (DLN).
☆94 · Updated 10 months ago
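The description above can be sketched in a few lines: each layer holds a natural-language prompt as its learnable parameter and applies it via a language-model call, and two layers are stacked so one layer's output becomes the next layer's input. This is a minimal illustrative sketch, not the repository's actual API; `call_lm` is a hypothetical stand-in for a real LLM call and is stubbed here so the example runs without a model.

```python
def call_lm(prompt: str, text: str) -> str:
    # Hypothetical stub standing in for a stochastic LLM call;
    # a real DLN would sample a completion from a language model here.
    return f"[{prompt}] {text}"


class LanguageLayer:
    """One language layer: its learnable parameter is a natural-language prompt."""

    def __init__(self, prompt: str):
        self.prompt = prompt  # the layer's (learnable) prompt

    def forward(self, text: str) -> str:
        return call_lm(self.prompt, text)


class DLN:
    """Two stacked language layers: layer 1's output feeds layer 2."""

    def __init__(self, prompt1: str, prompt2: str):
        self.layers = [LanguageLayer(prompt1), LanguageLayer(prompt2)]

    def forward(self, text: str) -> str:
        for layer in self.layers:
            text = layer.forward(text)
        return text


dln = DLN("Summarize the input.", "Answer yes or no.")
print(dln.forward("Is the sky blue?"))
```

Training such a network would mean searching over the prompt strings themselves (e.g. by proposing and scoring candidate prompts) rather than over numeric weights.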
Alternatives and similar repositories for deep-language-networks
Users interested in deep-language-networks are comparing it to the libraries listed below.
- SILO Language Models code repository ☆81 · Updated last year
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆48 · Updated last year
- RL algorithm: Advantage induced policy alignment ☆65 · Updated last year
- ☆29 · Updated 10 months ago
- Building modular LMs with parameter-efficient fine-tuning. ☆105 · Updated this week
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆54 · Updated last year
- ☆48 · Updated last year
- Self-Alignment with Principle-Following Reward Models ☆161 · Updated last month
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI ☆94 · Updated 2 years ago
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆116 · Updated 8 months ago
- The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences". ☆69 · Updated last year
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆61 · Updated 2 years ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆70 · Updated 2 years ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions" ☆55 · Updated last year
- A repository for transformer critique learning and generation ☆90 · Updated last year
- ☆74 · Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆70 · Updated 11 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆47 · Updated last year
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following ☆79 · Updated 8 months ago
- ☆44 · Updated 6 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆108 · Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆43 · Updated last year
- ☆159 · Updated 2 years ago
- A unified benchmark for math reasoning ☆88 · Updated 2 years ago
- ☆120 · Updated 8 months ago
- Large language models (LLMs) made easy, EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆75 · Updated 9 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 · Updated last year
- ☆44 · Updated 9 months ago
- [NeurIPS 2023 Main Track] This is the repository for the paper titled "Don't Stop Pretraining? Make Prompt-based Fine-tuning Powerful Lea… ☆74 · Updated last year
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM) ☆89 · Updated last year