okhat / blog
☆273 Updated 6 months ago
Alternatives and similar repositories for blog:
Users who are interested in blog are comparing it to the repositories listed below
- Understanding R1-Zero-Like Training: A Critical Perspective ☆725 Updated this week
- A bibliography and survey of the papers surrounding o1 ☆1,183 Updated 4 months ago
- A Survey on Efficient Reasoning for LLMs ☆204 Updated this week
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆300 Updated 4 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆153 Updated 2 weeks ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,015 Updated 2 months ago
- A brief and partial summary of RLHF algorithms. ☆127 Updated 3 weeks ago
- Paper list for Efficient Reasoning. ☆331 Updated this week
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆138 Updated 3 weeks ago
- [ICML 2024] CLLMs: Consistency Large Language Models ☆388 Updated 4 months ago
- A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning mate… ☆249 Updated last month
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666. ☆347 Updated this week
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality ☆179 Updated 7 months ago
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc. ☆213 Updated last week
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models ☆439 Updated this week
- AnchorAttention: Improved attention for LLMs long-context training ☆206 Updated 2 months ago
- RewardBench: the first evaluation tool for reward models. ☆532 Updated last month
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆405 Updated 11 months ago
- [ACM Computing Surveys 2025] This repository collects awesome survey, resource, and paper for Lifelong Learning with Large Language Model… ☆115 Updated last month
- ☆262 Updated 2 weeks ago
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆209 Updated 3 weeks ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc. ☆300 Updated last week
- Some preliminary explorations of Mamba's context scaling. ☆212 Updated last year
- This repository collects all relevant resources about interpretability in LLMs ☆328 Updated 5 months ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆135 Updated last month
- Build your own visual reasoning model ☆320 Updated this week
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆124 Updated 2 months ago
- Paper List of Inference/Test Time Scaling/Computing ☆131 Updated this week
- ☆445 Updated 8 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆162 Updated 2 weeks ago