hkproj / dpo-notes
Notes on Direct Preference Optimization
☆12Updated 9 months ago
Alternatives and similar repositories for dpo-notes:
Users who are interested in dpo-notes are comparing it to the repositories listed below
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM).☆157Updated last week
- Open Implementations of LLM Analyses☆98Updated 3 months ago
- Notes and commented code for RLHF (PPO)☆51Updated 10 months ago
- Distributed training (multi-node) of a Transformer model☆49Updated 9 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆64Updated 3 weeks ago
- A curated list of papers related to constrained decoding of LLMs, along with their relevant code and resources.☆126Updated last month
- ☆98Updated last month
- ReBase: Training Task Experts through Retrieval Based Distillation☆28Updated 6 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆153Updated last month
- ☆65Updated 6 months ago
- ☆40Updated 8 months ago
- A pipeline for LLM knowledge distillation☆83Updated 5 months ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts?☆40Updated 3 months ago
- ☆28Updated 5 months ago
- Code for NeurIPS LLM Efficiency Challenge☆54Updated 9 months ago
- Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆145Updated 7 months ago
- ☆48Updated 11 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆77Updated 3 months ago
- A curated list of the role of small models in the LLM era☆89Updated 3 months ago
- LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models☆73Updated 3 months ago
- ☆87Updated last month
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"☆50Updated 3 months ago
- Augmented LLM with self-reflection☆109Updated last year
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts.☆98Updated 5 months ago
- Official implementation of paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038)☆173Updated 4 months ago
- "Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" b…☆39Updated 10 months ago
- Cascade Speculative Drafting☆28Updated 9 months ago
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI☆101Updated last week
- A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details.☆133Updated this week
- Data preparation code for CrystalCoder 7B LLM☆44Updated 8 months ago