HazyResearch / TART

TART: A plug-and-play Transformer module for task-agnostic reasoning

☆200

Alternatives and similar repositories for TART

Users who are interested in TART are comparing it to the libraries listed below.

kernelmachine / cbtm
Code repository for the c-BTM paper
☆107 Updated 2 years ago
IBM / ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp…
☆224 Updated last month
tianjunz / HIR
☆159 Updated 2 years ago
haoliuhl / chain-of-hindsight
Simple next-token-prediction for RLHF
☆226 Updated 2 years ago
ruiqi-zhong / D5
The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions
☆71 Updated 2 years ago
CarperAI / autocrit
A repository for transformer critique learning and generation
☆88 Updated last year
google / sycophancy-intervention
Scripts for generating synthetic finetuning data for reducing sycophancy.
☆116 Updated 2 years ago
orhonovich / unnatural-instructions
☆179 Updated 2 years ago
IBM / SALMON
Self-Alignment with Principle-Following Reward Models
☆168 Updated last month
akoksal / LongForm
Reverse Instructions to generate instruction tuning data with corpus examples
☆214 Updated last year
SALT-NLP / demonstrated-feedback
☆128 Updated last year
hydrallm / llama-moe-v1
☆95 Updated 2 years ago
Edward-Sun / RECITE
Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI
☆94 Updated 2 years ago
seonghyeonye / Flipped-Learning
[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
☆116 Updated 3 months ago
bhargaviparanjape / language-programmes
☆173 Updated 2 years ago
neelsjain / BYOD
The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"
☆107 Updated 2 years ago
imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆201 Updated last year
lz1oceani / verify_cot
☆134 Updated last year
ConsequentAI / fneval
Functional Benchmarks and the Reasoning Gap
☆89 Updated last year
facebookresearch / Shepherd
This is the repo for the paper Shepherd -- A Critic for Language Model Generation
☆217 Updated 2 years ago
r-three / phatgoose
Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"
☆90 Updated last year
McGill-NLP / length-generalization
Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023
☆136 Updated last year
lukasberglund / reversal_curse
☆296 Updated last year
KaiNylund / lm-weights-encode-time
☆69 Updated last year
dwzhu-pku / PoSE
Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024)
☆204 Updated last year
jayelm / gisting
Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467
☆296 Updated 8 months ago
huggingface / datablations
Scaling Data-Constrained Language Models
☆342 Updated 3 months ago
kyleliang919 / Long-context-transformers
Exploring finetuning public checkpoints on filter 8K sequences on Pile
☆115 Updated 2 years ago
Digitous / LLM-SLERP-Merge
Spherical Merge Pytorch/HF format Language Models with minimal feature loss.
☆138 Updated 2 years ago
nyu-mll / ILF-for-code-generation
☆80 Updated 7 months ago