andersonbcdefg / dpo-lora
direct preference optimization with only 1 model copy :)
☆13 · Updated last year
Alternatives and similar repositories for dpo-lora:
Users who are interested in dpo-lora are comparing it to the libraries listed below
- ☆60 · Updated last year
- ☆31 · Updated 5 months ago
- ☆58 · Updated 9 months ago
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆22 · Updated 4 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 · Updated last year
- Nexusflow function call, tool use, and agent benchmarks. ☆19 · Updated 2 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 · Updated 9 months ago
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation. ☆41 · Updated 5 months ago
- ☆22 · Updated last year
- ☆62 · Updated 4 months ago
- ☆48 · Updated last year
- NanoGPT (124M) quality in 2.67B tokens ☆27 · Updated this week
- Score LLM pretraining data with classifiers ☆54 · Updated last year
- Functional Benchmarks and the Reasoning Gap ☆82 · Updated 4 months ago
- An alternative way to calculate self-attention ☆18 · Updated 8 months ago
- A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats. ☆30 · Updated last year
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆81 · Updated 11 months ago
- Public Inflection Benchmarks ☆69 · Updated 11 months ago
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆40 · Updated last year
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆42 · Updated 7 months ago
- ☆81 · Updated last year
- Simplex Random Feature attention, in PyTorch ☆74 · Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 · Updated 11 months ago
- ☆37 · Updated 6 months ago
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆59 · Updated 9 months ago
- Testing paligemma2 finetuning on reasoning dataset ☆18 · Updated last month
- ☆67 · Updated 6 months ago
- Scripts for generating synthetic finetuning data for reducing sycophancy. ☆107 · Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆25 · Updated 3 months ago