yuandong-tian / arXiv_recbot
A Telegram bot to recommend arXiv papers
☆219Updated last week
Alternatives and similar repositories for arXiv_recbot:
Users who are interested in arXiv_recbot are comparing it to the libraries listed below
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation☆195Updated this week
- Related works and background techniques for OpenAI o1☆192Updated last week
- The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models".☆200Updated this week
- A series of technical reports on Slow Thinking with LLM☆297Updated last week
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"☆329Updated 6 months ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)☆521Updated 2 weeks ago
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆90Updated last week
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗).☆256Updated this week
- Cool Papers - Immersive Paper Discovery☆447Updated last month
- A brief and partial summary of RLHF algorithms.☆89Updated last month
- Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey☆274Updated this week
- AI Alignment: A Comprehensive Survey☆133Updated last year
- veRL: Volcano Engine Reinforcement Learning for LLM☆690Updated this week
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.☆272Updated 5 months ago
- ☆169Updated 2 months ago
- Curation of resources for LLM research, screened by @tongyx361 to ensure high quality and accompanied with elaborately-written concise de…☆45Updated 6 months ago
- Paper collections of the continuous effort starting from World Models.☆161Updated 6 months ago
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality☆170Updated 5 months ago
- Repo of paper "Free Process Rewards without Process Labels"☆94Updated this week
- My learning notes/codes for ML SYS.☆367Updated this week
- A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or l…☆276Updated last year
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)☆167Updated last year
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆145Updated last month
- Reference implementation for Token-level Direct Preference Optimization(TDPO)☆124Updated 6 months ago
- ☆185Updated last year
- A generalized framework for subspace tuning methods in parameter efficient fine-tuning.☆120Updated last week
- awesome papers in LLM interpretability☆378Updated this week
- ☆317Updated 6 months ago
- OpenReview Submission Visualization (ICLR 2024/2025)☆148Updated 3 months ago
- Align Anything: Training All-modality Model with Feedback☆422Updated last week