andersonbcdefg / dpo-lora

direct preference optimization with only 1 model copy :)

☆14

Alternatives and similar repositories for dpo-lora

Users who are interested in dpo-lora are comparing it to the libraries listed below.

PrimeIntellect-ai / prime-vllm
Modded vLLM to run pipeline parallelism over public networks
☆37 Updated last month
SpellcraftAI / turing
Turing machines, Rule 110, and A::B reversal using Claude 3 Opus.
☆58 Updated last year
PrimeIntellect-ai / toploc
TOPLOC is a novel method for verifiable inference that enables users to verify that LLM providers are using the correct model configurat…
☆34 Updated 2 months ago
PrimeIntellect-ai / INTELLECT-MATH
A 7B parameter model for mathematical reasoning
☆40 Updated 4 months ago
PrimeIntellect-ai / pccl
PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP
☆96 Updated last month
teknium1 / RawTransform
A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats.
☆30 Updated 2 years ago
VikParuchuri / classified
Score LLM pretraining data with classifiers
☆55 Updated last year
PrimeIntellect-ai / prime-cli
The Prime Intellect CLI provides a powerful command-line interface for managing GPU resources across various providers
☆29 Updated last month
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆54 Updated 5 months ago
nano-R1 / resources
Compiling useful links, papers, benchmarks, ideas, etc.
☆46 Updated 3 months ago
Danau5tin / calculator_agent_rl
Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a 62% absolute increase in evaluation accuracy.
☆41 Updated 2 months ago
xjdr-alt / llmri
look how they massacred my boy
☆63 Updated 8 months ago
thesephist / spectre
Sparse autoencoders for Contra text embedding models
☆25 Updated last year
JD-P / RetroInstruct
Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search.
☆32 Updated 4 months ago
joey00072 / Attention-as-graph
alternative way to calculate self-attention
☆18 Updated last year
google-deepmind / latent-multi-hop-reasoning
[ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?
☆71 Updated 3 months ago
HazyResearch / train-tk
train with kittens!
☆61 Updated 8 months ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆101 Updated 4 months ago
1rgs / token-trekker-rs
☆13 Updated 2 years ago
PrimeIntellect-ai / genesys
☆128 Updated 3 months ago
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆119 Updated 11 months ago
xjdr-alt / muzero_sketch
☆38 Updated 11 months ago
yizhe-ang / interactive-transformer
A visual interface for understanding and interpreting Transformers
☆77 Updated last year
notarussianteenager / srf-attention
Simplex Random Feature attention, in PyTorch
☆74 Updated last year
facebookresearch / LeanUniverse
LeanUniverse: A Library for Consistent and Scalable Lean4 Dataset Management
☆69 Updated 5 months ago
Datura-ai / cortex.t
☆63 Updated 6 months ago
catid / lllm
Latent Large Language Models
☆18 Updated 10 months ago
doomslide / autoloom
Approximating the joint distribution of language models via MCTS
☆21 Updated 8 months ago
PrimeIntellect-ai / smart-contracts
Solidity contracts for the decentralized Prime Network protocol
☆23 Updated this week
kyegomez / Exa
Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…
☆27 Updated 8 months ago