stanford-cs336 / spring2024-lectures
☆162 · Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for spring2024-lectures
- ☆64 · Updated last month
- RuLES: a benchmark for evaluating rule-following in language models ☆211 · Updated last month
- A bibliography and survey of the papers surrounding o1 ☆780 · Updated this week
- A brief and partial summary of RLHF algorithms. ☆64 · Updated this week
- The official evaluation suite and dynamic data release for MixEval. ☆224 · Updated last week
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day ☆252 · Updated last year
- LLM-Merging: Building LLMs Efficiently through Merging ☆176 · Updated last month
- Building blocks for foundation models. ☆397 · Updated 10 months ago
- ☆149 · Updated 6 months ago
- LOFT: A 1 Million+ Token Long-Context Benchmark ☆146 · Updated 3 weeks ago
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning ☆339 · Updated 2 months ago
- Can Language Models Solve Olympiad Programming? ☆101 · Updated 3 months ago
- ☆115 · Updated 4 months ago
- Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022) ☆165 · Updated last year
- LoRA and DoRA from Scratch Implementations ☆188 · Updated 8 months ago
- ☆322 · Updated 4 months ago
- A benchmark list for the evaluation of large language models. ☆68 · Updated 4 months ago
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆193 · Updated this week
- Implementation of the paper Data Engineering for Scaling Language Models to 128K Context ☆438 · Updated 8 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users ☆195 · Updated 2 weeks ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆190 · Updated this week
- Open-source code for the paper: Retrieval Head Mechanistically Explains Long-Context Factuality ☆160 · Updated 3 months ago
- RewardBench: the first evaluation tool for reward models. ☆436 · Updated 3 weeks ago
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ☆147 · Updated this week
- Code and example data for the paper: Rule Based Rewards for Language Model Safety ☆158 · Updated 4 months ago
- Language models scale reliably with over-training and on downstream tasks ☆94 · Updated 7 months ago
- ☆50 · Updated 4 months ago
- RLHF implementation details of OAI's 2019 codebase ☆153 · Updated 10 months ago
- ☆90 · Updated 4 months ago
- Textbook on reinforcement learning from human feedback ☆76 · Updated 3 weeks ago