Mayankpratapsingh022 / DeepSeek-from-Scratch
☆53 · Updated 3 months ago
Alternatives and similar repositories for DeepSeek-from-Scratch
Users who are interested in DeepSeek-from-Scratch are comparing it to the repositories listed below.
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆271 · Updated 3 months ago
- Verifiers for LLM Reinforcement Learning ☆75 · Updated last month
- The code repository of the paper: Competition and Attraction Improve Model Fusion ☆161 · Updated last month
- Real-Time Detection of Hallucinated Entities in Long-Form Generation ☆258 · Updated last month
- frozen-in-time version of our Paper Finder agent for reproducing evaluation results ☆189 · Updated last month
- ☆181 · Updated 8 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning ☆37 · Updated 5 months ago
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆443 · Updated last month
- ☆300 · Updated 2 months ago
- ☆157 · Updated 6 months ago
- Coding an LLM and its building blocks from scratch. ☆96 · Updated 6 months ago
- ☆95 · Updated 6 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆103 · Updated 9 months ago
- 📓 A collection of generative AI open-source repositories that are actively being developed. If you are looking to build a solid profile … ☆81 · Updated last week
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆459 · Updated last month
- "LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!" ☆134 · Updated last week
- Luth is a state-of-the-art series of fine-tuned LLMs for French ☆34 · Updated this week
- An Automatic Prompt Optimization Framework for Large Language Models ☆126 · Updated 2 months ago
- Train LLM on Hugging Face infra ☆64 · Updated last month
- Anemoi: A Semi-Centralized Multi-agent Systems Based on Agent-to-Agent Communication MCP server from Coral Protocol ☆367 · Updated last month
- The purpose of this repo is to implement LLMOps as shared in the DeepLearning.AI course ☆35 · Updated last week
- Learn the building blocks of how to build gpt-oss from scratch ☆88 · Updated 3 weeks ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆296 · Updated last month
- Simple examples using Argilla tools to build AI ☆56 · Updated 11 months ago
- Solving data for LLMs - Create quality synthetic datasets! ☆151 · Updated 8 months ago
- Measuring Thinking Efficiency in Reasoning Models - Research Repository ☆37 · Updated 2 weeks ago
- Context Engineering Course with DSPy ☆194 · Updated 2 months ago
- ☆45 · Updated 2 months ago
- Implementation of a GPT-4o like Multimodal from Scratch using Python ☆72 · Updated 6 months ago
- Collection of impressive LLM apps with a focus on the financial sector ☆136 · Updated last month