andersonbcdefg / dpo-lora
direct preference optimization with only 1 model copy :)
☆14 · Updated last year
Alternatives and similar repositories for dpo-lora:
Users who are interested in dpo-lora are comparing it to the libraries listed below.
- Data preparation code for CrystalCoder 7B LLM ☆44 · Updated 10 months ago
- Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search. ☆31 · Updated last month
- ☆60 · Updated last year
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆28 · Updated 2 weeks ago
- Public Inflection Benchmarks ☆68 · Updated last year
- Simplex Random Feature attention, in PyTorch ☆74 · Updated last year
- Small, simple agent task environments for training and evaluation ☆18 · Updated 4 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆56 · Updated last week
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆63 · Updated last year
- ☆63 · Updated 6 months ago
- ☆16 · Updated 2 months ago
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆59 · Updated 10 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 · Updated last month
- Functional Benchmarks and the Reasoning Gap ☆84 · Updated 5 months ago
- ☆22 · Updated last year
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Updated 11 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 · Updated 3 weeks ago
- Score LLM pretraining data with classifiers ☆54 · Updated last year
- ☆20 · Updated 4 months ago
- ☆60 · Updated 11 months ago
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation. ☆43 · Updated 7 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆82 · Updated last year
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode… ☆37 · Updated this week
- A 7B parameter model for mathematical reasoning ☆23 · Updated last month
- ☆17 · Updated 2 weeks ago
- A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats. ☆30 · Updated last year
- A synthetic story narration dataset to study small audio LMs. ☆32 · Updated last year
- ☆48 · Updated last year
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆22 · Updated 3 months ago
- Alternative way to calculate self-attention ☆18 · Updated 10 months ago