neubig / minllama-assignment
☆100Updated last year
Alternatives and similar repositories for minllama-assignment
Users who are interested in minllama-assignment are comparing it to the repositories listed below
- An assignment for building an NLP system from scratch.☆27Updated last year
- Advanced NLP, Spring 2025 https://cmu-l3.github.io/anlp-spring2025/☆71Updated 10 months ago
- ☆190Updated 2 years ago
- Notes and commented code for RLHF (PPO)☆124Updated last year
- Direct Preference Optimization from scratch in PyTorch☆126Updated 10 months ago
- ☆413Updated last year
- NeurIPS 2024 tutorial on LLM Inference☆47Updated last year
- A brief and partial summary of RLHF algorithms.☆144Updated 11 months ago
- ☆140Updated last year
- minimal GRPO implementation from scratch☆102Updated 10 months ago
- ☆105Updated 6 months ago
- ☆169Updated 4 months ago
- Resources for cultural NLP research☆113Updated 4 months ago
- Distributed training (multi-node) of a Transformer model☆94Updated last year
- Notes on Direct Preference Optimization☆24Updated last year
- Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022)☆220Updated 2 years ago
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)☆105Updated 2 years ago
- "Improving Mathematical Reasoning with Process Supervision" by OpenAI☆114Updated last week
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".☆188Updated 2 months ago
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation☆77Updated 9 months ago
- ☆160Updated last year
- ☆99Updated last year
- LLM-Merging: Building LLMs Efficiently through Merging☆209Updated last year
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆224Updated last month
- The Paper List on Data Contamination for Large Language Models Evaluation.☆110Updated 2 weeks ago
- Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models☆68Updated 9 months ago
- ☆82Updated last year
- Prune transformer layers☆74Updated last year
- ☆232Updated 2 months ago
- An extension of the nanoGPT repository for training small MOE models.☆236Updated 11 months ago