microsoft / deep-language-networks

We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts at each layer. We stack two such layers, feeding the output of one layer to the next. We call the stacked architecture a Deep Language Network (DLN).
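As a rough illustration of the stacking idea only, the sketch below chains two prompt-parameterized layers so that the output of the first becomes the input of the second. The names `LanguageLayer`, `dln_forward`, and `toy_llm` are hypothetical and are not taken from the repository; `toy_llm` is a deterministic stub standing in for a real (stochastic) model call, and prompt learning is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for an LLM call: prompt text in, completion out.
LLM = Callable[[str], str]


@dataclass
class LanguageLayer:
    """One language layer; its only parameter is a natural language prompt."""
    prompt: str
    llm: LLM

    def forward(self, text: str) -> str:
        # The layer's prompt is prepended to the incoming text.
        return self.llm(f"{self.prompt}\n\n{text}")


def dln_forward(layers: List[LanguageLayer], x: str) -> str:
    """Feed the output of each layer into the next one."""
    for layer in layers:
        x = layer.forward(x)
    return x


def toy_llm(prompt_text: str) -> str:
    # Toy stub (assumption, not a real model): tags the input line
    # with the layer prompt so the data flow is visible.
    lines = prompt_text.splitlines()
    return f"[{lines[0]}] {lines[-1]}"


layers = [
    LanguageLayer("Think step by step.", toy_llm),
    LanguageLayer("Give the final answer.", toy_llm),
]
result = dln_forward(layers, "two plus two")
print(result)
```

In a real setting the stub would be replaced by a model call, and the two `prompt` strings would be the quantities being optimized.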

☆94

Alternatives and similar repositories for deep-language-networks

Users who are interested in deep-language-networks compare it to the libraries listed below.

neelsjain / BYOD
The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"
☆107 Updated 2 years ago
kernelmachine / silo-lm
SILO Language Models code repository
☆83 Updated last year
EleutherAI / semantic-memorization
☆44 Updated last year
casmlab / NPHardEval
Repository for NPHardEval, a quantified-dynamic benchmark of LLMs
☆60 Updated last year
r-three / RAD
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆44 Updated last month
Edward-Sun / RECITE
Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI
☆95 Updated 2 years ago
hadasah / btm
☆76 Updated last year
LoryPack / LLM-LieDetector
Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"
☆71 Updated last year
zorazrw / odex
[EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation
☆49 Updated last year
XiangLi1999 / AutoBencher
☆32 Updated last year
CarperAI / autocrit
A repository for transformer critique learning and generation
☆89 Updated last year
QingruZhang / PASTA
PASTA: Post-hoc Attention Steering for LLMs
☆128 Updated 11 months ago
ruiqi-zhong / D5
The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions
☆71 Updated 2 years ago
kernelmachine / cbtm
Code repository for the c-BTM paper
☆108 Updated 2 years ago
microsoft / RLHF-APA
RL algorithm: Advantage induced policy alignment
☆65 Updated 2 years ago
neulab / gemini-benchmark
☆150 Updated last year
seonghyeonye / Flipped-Learning
[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
☆116 Updated 4 months ago
SALT-NLP / demonstrated-feedback
☆129 Updated last year
WHGTyen / BIG-Bench-Mistake
A dataset of LLM-generated chain-of-thought steps annotated with mistake location.
☆83 Updated last year
r-three / phatgoose
Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"
☆91 Updated last year
allenai / easy-to-hard-generalization
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48 Updated last year
guy-dar / embedding-space
☆55 Updated 2 years ago
alon-albalak / FLAD
Few-shot Learning with Auxiliary Data
☆31 Updated last year
ahans30 / goldfish-loss
[NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs
☆92 Updated last year
princeton-nlp / TransformerPrograms
[NeurIPS 2023] Learning Transformer Programs
☆162 Updated last year
niansong1996 / lever
Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23)
☆90 Updated 2 years ago
yidingjiang / ado
The repository contains code for Adaptive Data Optimization
☆28 Updated 11 months ago
hamishivi / EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆75 Updated last year
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆100 Updated last year
TheDuckAI / arb
Advanced Reasoning Benchmark Dataset for LLMs
☆47 Updated 2 years ago