okhat / blog
☆271 · Updated 4 months ago
Alternatives and similar repositories for blog:
Users interested in blog are comparing it to the repositories listed below.
- A bibliography and survey of the papers surrounding o1 ☆1,155 · Updated 3 months ago
- A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc. ☆201 · Updated 4 months ago
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality ☆172 · Updated 6 months ago
- This repository collects all relevant resources about interpretability in LLMs ☆321 · Updated 3 months ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆289 · Updated 3 months ago
- GPT4 based personalized ArXiv paper assistant bot ☆506 · Updated 10 months ago
- RewardBench: the first evaluation tool for reward models. ☆505 · Updated this week
- A brief and partial summary of RLHF algorithms. ☆93 · Updated 2 months ago
- [ICML 2024] CLLMs: Consistency Large Language Models ☆372 · Updated 3 months ago
- A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning mate… ☆229 · Updated 5 months ago
- System 2 Reasoning Link Collection ☆794 · Updated 2 weeks ago
- A platform for developers to simulate collaborative research activities ☆138 · Updated this week
- Code for the paper 🌳 Tree Search for Language Model Agents ☆178 · Updated 6 months ago
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆59 · Updated 3 months ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆104 · Updated 3 weeks ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆877 · Updated 3 weeks ago
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666. ☆316 · Updated this week
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc. ☆194 · Updated last week
- Sparsify transformers with SAEs and transcoders ☆461 · Updated this week
- Code and example data for the paper: Rule Based Rewards for Language Model Safety ☆178 · Updated 7 months ago
- A simple unified framework for evaluating LLMs ☆197 · Updated 2 weeks ago
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs). ☆803 · Updated last week
- Representation Engineering: A Top-Down Approach to AI Transparency ☆789 · Updated 6 months ago
- ☆421 · Updated 7 months ago
- ☆151 · Updated this week
- Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments). ☆280 · Updated 10 months ago
- ☆214 · Updated 2 weeks ago
- AnchorAttention: Improved attention for LLMs long-context training ☆205 · Updated last month
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆297 · Updated 2 months ago