microsoft / deep-language-networks
We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural-language prompts at each layer. We stack two such layers, feeding the output of one layer into the next. We call the stacked architecture a Deep Language Network (DLN).
☆94, updated 9 months ago
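The description above can be sketched as a forward pass through two stacked language layers, each parameterized by its prompt. This is a minimal illustrative sketch, not the repository's actual API: the layer factory, prompts, and the `mock_llm` stand-in (which replaces a real stochastic LLM call) are all hypothetical names.

```python
# Sketch of a two-layer Deep Language Network (DLN) forward pass.
# In the real repo each layer queries an actual LLM; here the call is mocked.

def make_language_layer(prompt, llm):
    """A 'language layer': its learnable parameter is the natural-language prompt."""
    def forward(x):
        # The layer conditions the LLM on its own prompt plus the incoming text.
        return llm(f"{prompt}\n\nInput: {x}\nOutput:")
    return forward

def mock_llm(full_prompt):
    # Hypothetical stand-in for a (stochastic) LLM completion call.
    return f"<completion of: {full_prompt.splitlines()[0]}>"

# Stack two layers: the first layer's output feeds the second.
layer1 = make_language_layer("Extract the key facts.", mock_llm)
layer2 = make_language_layer("Answer the question using the facts.", mock_llm)

def dln_forward(x):
    return layer2(layer1(x))

print(dln_forward("Why is the sky blue?"))
```

Learning a DLN then amounts to optimizing the two prompt strings (rather than any model weights) with respect to the final output.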
Alternatives and similar repositories for deep-language-networks:
Users interested in deep-language-networks are comparing it to the repositories listed below.
- RL algorithm: Advantage induced policy alignment (☆65, updated last year)
- SILO Language Models code repository (☆81, updated last year)
- A repository for transformer critique learning and generation (☆90, updated last year)
- ☆44, updated 5 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model (☆45, updated last year)
- ☆27, updated 9 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs (☆53, updated last year)
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM) (☆88, updated last year)
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI (☆94, updated 2 years ago)
- Building modular LMs with parameter-efficient fine-tuning. (☆103, updated this week)
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions (☆69, updated 2 years ago)
- ☆72, updated 11 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" (☆108, updated last year)
- ☆46, updated last year
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation (☆47, updated last year)
- A framework for few-shot evaluation of autoregressive language models. (☆104, updated last year)
- A unified benchmark for math reasoning (☆87, updated 2 years ago)
- The instructions and demonstrations for building a formal logical reasoning capable GLM (☆53, updated 7 months ago)
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… (☆60, updated last year)
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning (☆98, updated 2 years ago)
- The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences". (☆69, updated last year)
- Language models scale reliably with over-training and on downstream tasks (☆96, updated last year)
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs (☆84, updated 5 months ago)
- ☆45, updated last year
- ☆38, updated last year
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following (☆79, updated 7 months ago)
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning". (☆158, updated 11 months ago)
- Code repository for the c-BTM paper (☆106, updated last year)
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models (☆46, updated last year)
- ☆120, updated 6 months ago