microsoft / deep-language-networks
We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts at each layer. We stack two such layers, feeding the output of the first layer into the second. We call the stacked architecture a Deep Language Network (DLN).
☆94 Updated last year
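The stacking described above can be sketched in a few lines. This is a minimal illustration only, not the repository's API: `LanguageLayer`, `TwoLayerDLN`, the prompt template, and `stub_llm` are all hypothetical names, and the stub is a deterministic stand-in for a real (stochastic) model call. Prompt optimization, that is, training, is not shown.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


@dataclass
class LanguageLayer:
    """One language layer: an LLM call parameterized by a prompt."""
    prompt: str  # the layer's "learnable parameter" (natural language)
    llm: LLM

    def forward(self, text: str) -> str:
        # Assumed template; a real system may format its inputs differently.
        return self.llm(f"{self.prompt}\n\nInput: {text}\nOutput:")


@dataclass
class TwoLayerDLN:
    """Two stacked layers: the first layer's output feeds the second."""
    first: LanguageLayer
    second: LanguageLayer

    def forward(self, text: str) -> str:
        hidden = self.first.forward(text)   # intermediate text
        return self.second.forward(hidden)  # final output


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a model: upper-cases the input field."""
    text = prompt.split("Input: ", 1)[1].rsplit("\nOutput:", 1)[0]
    return text.upper()


dln = TwoLayerDLN(
    first=LanguageLayer(prompt="Rewrite the input step by step.", llm=stub_llm),
    second=LanguageLayer(prompt="Give the final answer.", llm=stub_llm),
)
result = dln.forward("hello")
```

Swapping `stub_llm` for a function that calls an actual model would leave the stacking logic unchanged; only the two `prompt` strings would be updated during training.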
Alternatives and similar repositories for deep-language-networks
Users who are interested in deep-language-networks are comparing it to the repositories listed below.
- ☆44 Updated 11 months ago
- SILO Language Models code repository ☆83 Updated last year
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆107 Updated 2 years ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆59 Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆43 Updated 3 weeks ago
- RL algorithm: Advantage induced policy alignment ☆65 Updated 2 years ago
- A repository for transformer critique learning and generation ☆88 Updated last year
- ☆76 Updated last year
- ☆32 Updated last year
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆49 Updated last year
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI ☆94 Updated 2 years ago
- ☆149 Updated last year
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆71 Updated 2 years ago
- Language models scale reliably with over-training and on downstream tasks ☆100 Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆71 Updated last year
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆116 Updated 4 months ago
- The repository contains code for Adaptive Data Optimization ☆26 Updated 10 months ago
- ☆45 Updated 2 years ago
- ☆11 Updated 2 years ago
- Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023 ☆136 Updated last year
- ☆69 Updated last year
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆48 Updated last year
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆90 Updated last year
- Few-shot Learning with Auxiliary Data ☆31 Updated last year
- Aioli: A unified optimization framework for language model data mixing ☆28 Updated 9 months ago
- ☆55 Updated 2 years ago
- ☆80 Updated 7 months ago
- Code repository for the c-BTM paper ☆107 Updated 2 years ago
- Advanced Reasoning Benchmark Dataset for LLMs ☆46 Updated last year
- ☆39 Updated last year