shreyansh26 / LLM-Sampling
A collection of various LLM sampling methods implemented in pure PyTorch
☆19 Updated last month
Alternatives and similar repositories for LLM-Sampling
Users who are interested in LLM-Sampling are comparing it to the libraries listed below:
- ☆47 Updated 5 months ago
- PyTorch implementation for MRL ☆18 Updated 11 months ago
- MEXMA: Token-level objectives improve sentence representations ☆38 Updated 3 weeks ago
- ☆24 Updated last year
- ☆48 Updated 2 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 6 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆36 Updated 3 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆53 Updated 5 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated 10 months ago
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. ☆24 Updated last month
- Repository containing awesome resources regarding Hugging Face tooling. ☆46 Updated last year
- Utilities for PyTorch distributed ☆23 Updated last year
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning ☆59 Updated 5 months ago
- Supercharge huggingface transformers with model parallelism. ☆76 Updated 3 months ago
- Aioli: A unified optimization framework for language model data mixing ☆19 Updated 2 weeks ago
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given… ☆14 Updated last year
- LLM training in simple, raw C/CUDA ☆14 Updated last month
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆23 Updated 2 months ago
- Mixtral finetuning ☆19 Updated 11 months ago
- Prune transformer layers ☆67 Updated 8 months ago
- ☆37 Updated last year
- ReBase: Training Task Experts through Retrieval Based Distillation ☆28 Updated 6 months ago
- ☆76 Updated 7 months ago
- An introduction to LLM Sampling ☆75 Updated last month
- ☆31 Updated last year
- QLoRA for Masked Language Modeling ☆21 Updated last year
- Minimum Description Length probing for neural network representations ☆18 Updated this week
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model" ☆56 Updated 2 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆23 Updated 4 months ago