alexrs / herd

Mixture-of-Experts (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations.

☆12

Alternatives and similar repositories for herd

Users who are interested in herd are comparing it to the libraries listed below.

ElleLeonne / Lightning-ReLoRA
A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.
☆35 Updated last year
scottlogic-alex / prm800k-denorm
Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format
☆27 Updated 2 years ago
kyegomez / LM-Infinite
Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆39 Updated 11 months ago
arcee-ai / DAM
☆55 Updated 11 months ago
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆60 Updated last year
r-three / RAD
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆43 Updated 3 weeks ago
TRI-ML / linear_open_lm
A repository for research on medium-sized language models.
☆78 Updated last year
HazyResearch / aioli
Aioli: A unified optimization framework for language model data mixing
☆28 Updated 9 months ago
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆20 Updated 9 months ago
Zyphra / Zyda_processing
☆39 Updated last year
LLM360 / crystalcoder-data-prep
Data preparation code for CrystalCoder 7B LLM
☆45 Updated last year
Tebmer / Rereading-LLM-Reasoning
EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for…
☆27 Updated 10 months ago
kumar-shridhar / Screws
SCREWS: A Modular Framework for Reasoning with Revisions
☆27 Updated 2 years ago
yidingjiang / ado
The repository contains code for Adaptive Data Optimization
☆26 Updated 10 months ago
kaiokendev / cutoff-len-is-context-len
Demonstration that fine-tuning a RoPE model on longer sequences than the pre-trained model saw adapts the model's context limit
☆62 Updated 2 years ago
ctlllll / understanding_llm_benchmarks
Understanding the correlation between different LLM benchmarks
☆29 Updated last year
recursal / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆45 Updated last year
tanyuqian / cappy
NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer
☆44 Updated last year
facebookresearch / lss_eval
This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…
☆31 Updated 2 years ago
RobertCsordas / moe
Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"
☆38 Updated 4 months ago
nlp-uoregon / ullme
☆20 Updated 6 months ago
huggingface / peft-pytorch-conference
Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…
☆14 Updated 2 years ago
kyegomez / Infini-attention
Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…
☆56 Updated last week
LLM360 / Analysis360
Open Implementations of LLM Analyses
☆107 Updated last year
KaiNylund / lm-weights-encode-time
☆69 Updated last year
xingyaoww / LeTI
Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions."
☆64 Updated 2 years ago
dinobby / MAGDi
The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…
☆37 Updated last year
kyegomez / Reka-Torch
Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch
☆28 Updated last week
eth-easl / fmengine
Utilities for Training Very Large Models
☆58 Updated last year
catid / lllm
Latent Large Language Models
☆19 Updated last year