paulosantosneto / unofficial-cot-decoding
Unofficial implementation of the CoT-decoding method for extracting CoT paths in an unsupervised way
☆22Updated 6 months ago
Alternatives and similar repositories for unofficial-cot-decoding:
Users who are interested in unofficial-cot-decoding are comparing it to the libraries listed below
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning☆94Updated last week
- ☆95Updated last month
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting☆32Updated last year
- A Comprehensive Benchmark for Software Development.☆103Updated 10 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆68Updated last month
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling☆101Updated 3 months ago
- augmented LLM with self reflection☆120Updated last year
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models☆112Updated 11 months ago
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios☆56Updated last year
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI☆108Updated 2 weeks ago
- A curated paper list on LLM reasoning.☆86Updated last year
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"☆237Updated last week
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks☆142Updated 7 months ago
- ☆94Updated last month
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models☆57Updated last year
- Scalable Meta-Evaluation of LLMs as Evaluators☆42Updated last year
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective☆63Updated last month
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆86Updated last month
- ☆121Updated 10 months ago
- Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.☆42Updated 10 months ago
- NaturalCodeBench (Findings of ACL 2024)☆63Updated 6 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users☆221Updated 5 months ago
- Code implementation of synthetic continued pretraining☆104Updated 3 months ago
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"