ByungKwanLee / DeepSick-R1
Reproduction of DeepSeek-R1
☆121 Updated this week
Alternatives and similar repositories for DeepSick-R1:
Users who are interested in DeepSick-R1 compare it to the repositories listed below.
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆150 Updated 3 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆277 Updated last month
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed". ☆165 Updated this week
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts. ☆108 Updated 7 months ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆161 Updated 2 months ago
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation" ☆123 Updated 11 months ago
- minimal GRPO implementation from scratch ☆62 Updated 2 weeks ago
- LLM-Merging: Building LLMs Efficiently through Merging ☆192 Updated 6 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆162 Updated last week
- A brief and partial summary of RLHF algorithms. ☆127 Updated 3 weeks ago
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning ☆354 Updated 7 months ago
- A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF) ☆158 Updated 2 months ago
- Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training ☆252 Updated last month
- An extension of the nanoGPT repository for training small MOE models. ☆106 Updated 2 weeks ago
- Python Library to evaluate VLM models' robustness across diverse benchmarks ☆195 Updated last week
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO) ☆91 Updated this week
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ☆90 Updated 3 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind ☆174 Updated 6 months ago
- Implementation of Infini-Transformer in Pytorch ☆110 Updated 2 months ago
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆207 Updated 3 weeks ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆310 Updated 3 months ago
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla… ☆55 Updated 5 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆145 Updated 2 weeks ago
- Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆154 Updated 9 months ago
- Efficient Infinite Context Transformers with Infini-attention Pytorch Implementation + QwenMoE Implementation + Training Script + 1M cont… ☆80 Updated 10 months ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆163 Updated 2 months ago
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and inexpensive to use. Specific… ☆66 Updated 6 months ago