shirley-wu / cot_decoding
☆45Updated last year
Alternatives and similar repositories for cot_decoding
Users that are interested in cot_decoding are comparing it to the libraries listed below
- Spherical Merge Pytorch/HF format Language Models with minimal feature loss.☆123Updated last year
- ☆53Updated last year
- entropix style sampling + GUI☆26Updated 7 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆139Updated 3 months ago
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling☆93Updated 2 weeks ago
- Easy to use, High Performant Knowledge Distillation for LLMs☆85Updated 3 weeks ago
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe…☆151Updated last year
- ☆118Updated 9 months ago
- ☆45Updated last year
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆90Updated 2 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆69Updated 2 weeks ago
- ☆34Updated 11 months ago
- ☆49Updated 6 months ago
- 1.58-bit LLaMa model☆81Updated last year
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044☆33Updated 8 months ago
- How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training☆34Updated last month
- Verifiers for LLM Reinforcement Learning☆55Updated last month
- Small and Efficient Mathematical Reasoning LLMs☆71Updated last year
- A reproduction of the paper - Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction☆22Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models☆171Updated last year
- Data preparation code for CrystalCoder 7B LLM☆44Updated last year
- Evaluating LLMs with CommonGen-Lite☆90Updated last year
- Train your own SOTA deductive reasoning model☆92Updated 2 months ago
- Simple GRPO scripts and configurations.☆58Updated 3 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours☆61Updated 9 months ago
- Data preparation code for Amber 7B LLM☆90Updated last year
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)☆90Updated 4 months ago
- ☆51Updated 7 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks☆143Updated 8 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite.☆33Updated last year