yushuiwx / Mixture-of-LoRA-Experts
☆21 Updated 2 months ago
Alternatives and similar repositories for Mixture-of-LoRA-Experts:
Users who are interested in Mixture-of-LoRA-Experts are comparing it to the repositories listed below.
- ☆123 Updated 6 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆52 Updated 2 months ago
- The official code repository for PRMBench. ☆64 Updated this week
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" ☆348 Updated last month
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆68 Updated last month
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆154 Updated last month
- ☆61 Updated 8 months ago
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ☆289 Updated 6 months ago
- ☆64 Updated 10 months ago
- Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning ☆161 Updated last year
- LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment ☆284 Updated 9 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆109 Updated 11 months ago
- A RLHF Infrastructure for Vision-Language Models ☆162 Updated 3 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" ☆29 Updated 7 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆63 Updated 3 months ago
- ☆163 Updated 7 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆31 Updated 7 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆114 Updated 7 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆95 Updated 4 months ago
- 📜 Paper list on decoding methods for LLMs and LVLMs ☆20 Updated last month
- M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆55 Updated last month
- [SIGIR'24] The official implementation code of MOELoRA. ☆145 Updated 6 months ago
- TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models ☆64 Updated last year
- Reference implementation for Token-level Direct Preference Optimization(TDPO) ☆127 Updated last week
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆66 Updated 6 months ago
- State-of-the-art Parameter-Efficient MoE Fine-tuning Method ☆130 Updated 5 months ago
- my commonly-used tools ☆50 Updated last month
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆154 Updated 2 months ago