microsoft / deep-language-networksLinks
We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts at each layer. We stack two such layers, feeding the output of one layer to the next. We call the stacked architecture a Deep Language Network - DLN
☆94Updated 11 months ago
Alternatives and similar repositories for deep-language-networks
Users that are interested in deep-language-networks are comparing it to the libraries listed below
Sorting:
- SILO Language Models code repository☆81Updated last year
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"☆107Updated last year
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI☆94Updated 2 years ago
- RL algorithm: Advantage induced policy alignment☆65Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"☆71Updated last year
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)☆89Updated last year
- A repository for transformer critique learning and generation☆90Updated last year
- ☆44Updated 8 months ago
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following☆79Updated 10 months ago
- ☆124Updated 9 months ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions☆70Updated 2 years ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs☆56Updated last year
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning".☆161Updated last year
- The repository contains code for Adaptive Data Optimization☆25Updated 7 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"☆86Updated last year
- ☆75Updated last year
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners☆116Updated 2 weeks ago
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation☆48Updated last year
- ☆29Updated last year
- PASTA: Post-hoc Attention Steering for LLMs☆121Updated 7 months ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning☆98Updated 2 years ago
- Language models scale reliably with over-training and on downstream tasks☆97Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model☆43Updated last year
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca…☆61Updated 2 years ago
- Self-Alignment with Principle-Following Reward Models☆162Updated 2 months ago
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location.☆81Updated 11 months ago
- Code repository for the c-BTM paper☆106Updated last year
- Code accompanying the paper Pretraining Language Models with Human Preferences☆182Updated last year
- Inspecting and Editing Knowledge Representations in Language Models☆116Updated last year
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"☆59Updated 5 months ago