stair-lab / mlhp
Machine Learning from Human Preferences
☆26 Updated last week
Alternatives and similar repositories for mlhp
Users who are interested in mlhp are comparing it to the repositories listed below.
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili… ☆12 Updated 11 months ago
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors" ☆19 Updated last year
- ☆23 Updated last year
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆128 Updated 4 months ago
- Code for I-RAVEN-X generation and experiments ☆19 Updated 4 months ago
- Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality ☆317 Updated last month
- ☆33 Updated last year
- Official repo of dataset-decomposition paper [NeurIPS 2024] ☆21 Updated last year
- ☆28 Updated 2 months ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆130 Updated 2 months ago
- Package of Pathways-on-Cloud utilities ☆25 Updated this week
- Common tools for data processing ☆22 Updated 2 months ago
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆94 Updated last year
- Official repo of paper LM2 ☆46 Updated 11 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆163 Updated 7 months ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning" ☆292 Updated 2 months ago
- Official implementation of Recurrent Action Transformer with Memory, an offline RL agent with memory mechanisms. https://sites.google.com… ☆18 Updated 2 months ago
- Superposition Yields Robust Neural Scaling ☆54 Updated this week
- Universal Neurons in GPT2 Language Models ☆30 Updated last year
- Defeating the Training-Inference Mismatch via FP16 ☆182 Updated 2 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆150 Updated 4 months ago
- Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax" ☆87 Updated 5 months ago
- ☆69 Updated 10 months ago
- NeurIPS 2024 tutorial on LLM Inference ☆47 Updated last year
- ☆108 Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆187 Updated 3 weeks ago
- ☆136 Updated 2 months ago
- Brax + Pufferlib + CARBS for gpu-accelerated robotics RL ☆12 Updated 8 months ago
- AIRA-dojo: a framework for developing and evaluating AI research agents ☆127 Updated 3 weeks ago
- Implementation of PatchSAE as presented in "Sparse autoencoders reveal selective remapping of visual concepts during adaptation" ☆30 Updated 3 months ago