cmu-l3 / neurips2024-inference-tutorial-code
NeurIPS 2024 tutorial on LLM Inference
☆35 · Updated last month
Alternatives and similar repositories for neurips2024-inference-tutorial-code:
Users interested in neurips2024-inference-tutorial-code are comparing it to the repositories listed below
- ☆25 · Updated 8 months ago
- ☆54 · Updated 8 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 · Updated 7 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆45 · Updated 11 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆52 · Updated last week
- CodeUltraFeedback: aligning large language models to coding preferences ☆67 · Updated 6 months ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 · Updated last year
- The repository contains code for Adaptive Data Optimization ☆21 · Updated last month
- ☆83 · Updated 2 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆50 · Updated 9 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆52 · Updated 4 months ago
- ☆27 · Updated last week
- ☆92 · Updated 6 months ago
- ☆76 · Updated 2 weeks ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆41 · Updated 5 months ago
- ☆20 · Updated 3 months ago
- ☆68 · Updated 4 months ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆26 · Updated 4 months ago
- ☆48 · Updated 11 months ago
- Official repository of the paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆25 · Updated 8 months ago
- ☆21 · Updated this week
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data" ☆17 · Updated 10 months ago
- Replicating O1 inference-time scaling laws ☆70 · Updated last month
- Codebase for Instruction Following without Instruction Tuning ☆33 · Updated 3 months ago
- A repository for research on medium sized language models. ☆76 · Updated 7 months ago
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning ☆33 · Updated 5 months ago
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335) ☆29 · Updated 10 months ago
- ☆31 · Updated 6 months ago
- 🌾 OAT: Online AlignmenT for LLMs ☆78 · Updated 2 weeks ago
- Code for the paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System" ☆43 · Updated last month