naba89 / custom_hf_trainer
A custom Huggingface trainer which supports logging auxiliary losses returned by your model
☆15Updated 6 months ago
Alternatives and similar repositories for custom_hf_trainer
Users who are interested in custom_hf_trainer are comparing it to the repositories listed below
- An Efficient LLM Fine-Tuning Factory Optimized for MoE PEFT☆133Updated 11 months ago
- [ICLR 2026] TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models☆423Updated 2 weeks ago
- X-LoRA: Mixture of LoRA Experts☆261Updated last year
- A collection of papers on discrete diffusion models☆168Updated 7 months ago
- Paper list, tutorial, and nano code snippets for Diffusion Large Language Models.☆153Updated 3 weeks ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning☆261Updated 8 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.☆352Updated 3 months ago
- Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text"☆364Updated last year
- State-of-the-art Parameter-Efficient MoE Fine-tuning Method☆203Updated last year
- Automatically fetches diffusion NLP papers from arXiv. More paper information can be found in another repository, "Diffusion-LM-Papers".☆260Updated last week
- The official GitHub repo for the survey paper "A Survey on Diffusion Language Models".☆755Updated last week
- A brief and partial summary of RLHF algorithms.☆144Updated 11 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.☆276Updated last week
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch☆182Updated 7 months ago
- ☆176Updated last year
- [ICLR 2026] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.☆532Updated last month
- [EMNLP 2025] TokenSkip: Controllable Chain-of-Thought Compression in LLMs☆201Updated 2 months ago
- The trainer for HF to record losses of different tasks and objectives.☆49Updated 10 months ago
- [ICLR2025] DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models☆362Updated 8 months ago
- ☆108Updated 2 months ago
- Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"☆402Updated 2 weeks ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)☆154Updated 7 months ago
- ☆205Updated last month
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆246Updated 4 months ago
- AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper)☆393Updated 3 months ago
- ☆273Updated 2 years ago
- ☆144Updated 10 months ago
- The HELMET Benchmark☆198Updated 2 months ago
- ☆352Updated 6 months ago
- AnchorAttention: Improved attention for LLMs long-context training☆213Updated last year