GPT-Alternatives / gpt_alternatives
☆75 · Updated last year
Alternatives and similar repositories for gpt_alternatives
Users interested in gpt_alternatives are comparing it to the repositories listed below.
- Official implementation of “Training on the Benchmark Is Not All You Need” ☆34 · Updated 6 months ago
- LongQLoRA: Extend Context Length of LLMs Efficiently ☆166 · Updated last year
- A Mixture-of-Experts (MoE) implementation for PyTorch, [ATC'23] SmartMoE ☆64 · Updated 2 years ago
- SOTA open-source math LLM ☆333 · Updated last year
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models" ☆209 · Updated last year
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆135 · Updated last year
- Code for "Scaling Laws of RoPE-based Extrapolation" ☆73 · Updated last year
- A personal reimplementation of Google's Infini-Transformer, using a small 2B model. The project includes both model and train… ☆58 · Updated last year
- ☆106 · Updated last year
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆187 · Updated 3 months ago
- ☆144 · Updated last year
- AI Alignment: A Comprehensive Survey ☆135 · Updated last year
- SUS-Chat: Instruction tuning done right ☆48 · Updated last year
- ☆82 · Updated last year
- ☆147 · Updated 5 months ago
- Mixture-of-Experts (MoE) Language Model ☆189 · Updated 10 months ago
- Official implementation of the paper "Cumulative Reasoning With Large Language Models" (https://arxiv.org/abs/2308.04371) ☆294 · Updated 10 months ago
- ☆94 · Updated 7 months ago
- A fine-tuned LLaMA that is good at arithmetic tasks ☆178 · Updated last year
- A visualization tool for deeper understanding and easier debugging of RLHF training ☆228 · Updated 4 months ago
- [ICML'25] Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning ☆43 · Updated 2 months ago
- Counting-Stars (★) ☆83 · Updated last month
- Official implementation of TransNormerLLM: A Faster and Better LLM ☆247 · Updated last year
- ☆40 · Updated last year
- Unleashing the Power of Cognitive Dynamics on Large Language Models ☆62 · Updated 9 months ago
- [ICML'24] The official implementation of "Rethinking Optimization and Architecture for Tiny Language Models" ☆122 · Updated 6 months ago
- Reformatted Alignment ☆113 · Updated 9 months ago
- A highly capable, lightweight 2.4B LLM trained on only 1T tokens of pre-training data, with all training details ☆195 · Updated last week
- Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters ☆90 · Updated 2 years ago
- Delta-CoMe achieves near-lossless 1-bit compression; accepted at NeurIPS 2024 ☆57 · Updated 8 months ago