neubig / minllama-assignment
☆89 Updated 9 months ago
Alternatives and similar repositories for minllama-assignment
Users who are interested in minllama-assignment are comparing it to the repositories listed below.
- Advanced NLP, Spring 2025 https://cmu-l3.github.io/anlp-spring2025/ ☆55 Updated 2 months ago
- An assignment for building an NLP system from scratch. ☆27 Updated last year
- Notes and commented code for RLHF (PPO) ☆96 Updated last year
- ☆298 Updated 5 months ago
- A brief and partial summary of RLHF algorithms. ☆129 Updated 3 months ago
- Direct Preference Optimization from scratch in PyTorch ☆98 Updated 2 months ago
- NeurIPS 2024 tutorial on LLM Inference ☆45 Updated 6 months ago
- Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch ☆174 Updated 2 months ago
- ☆179 Updated last year
- Website ☆53 Updated 2 years ago
- Minimalist BERT implementation assignment for CS11-711 ☆83 Updated 2 years ago
- minimal GRPO implementation from scratch ☆90 Updated 3 months ago
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ☆55 Updated last month
- Resources for cultural NLP research ☆97 Updated 2 months ago
- Notes on Direct Preference Optimization ☆19 Updated last year
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022) ☆105 Updated 2 years ago
- Simple and efficient pytorch-native transformer training and inference (batched) ☆76 Updated last year
- Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models ☆57 Updated last month
- CS 224N Winter 2023 Default Final Project: Multitask BERT ☆25 Updated 2 years ago
- ☆132 Updated 7 months ago
- A simplified implementation for experimenting with RLVR on GSM8K. This repository provides a starting point for exploring reasoning. ☆101 Updated 4 months ago
- ☆51 Updated last year
- ☆117 Updated 3 months ago
- ☆83 Updated 5 months ago
- ☆33 Updated 3 months ago
- Critique-out-Loud Reward Models ☆66 Updated 8 months ago
- ☆180 Updated 2 months ago
- The Paper List on Data Contamination for Large Language Models Evaluation. ☆95 Updated 2 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆125 Updated this week
- ☆193 Updated 4 months ago