shivendrra / SmallLanguageModel
An LLM cookbook for building your own model from scratch, all the way from gathering data to training
☆167Updated last year
Alternatives and similar repositories for SmallLanguageModel
Users that are interested in SmallLanguageModel are comparing it to the libraries listed below
- Solving data for LLMs - Create quality synthetic datasets!☆151Updated 11 months ago
- Build a Streamlit Chatbot using Langchain, ColBERT, Ragatouille, and ChromaDB☆123Updated last year
- ☆87Updated last year
- ☆75Updated last year
- Simple examples using Argilla tools to build AI☆57Updated last year
- An automated tool for discovering insights from research paper corpora☆138Updated last year
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)☆91Updated 11 months ago
- ☆55Updated 4 months ago
- Function Calling Benchmark & Testing☆92Updated last year
- Finetune Llama-3-8b on the MathInstruct dataset☆116Updated last year
- ☆119Updated last year
- ☆101Updated last year
- A simple MLX implementation for pretraining LLMs on Apple Silicon.☆85Updated 4 months ago
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…☆85Updated last year
- ☆127Updated 9 months ago
- Following Karpathy with GPT-2 implementation and training, writing lots of comments because I have the memory of a goldfish☆172Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free☆233Updated last year
- Examples of RAG using Llamaindex with local LLMs - Gemma, Mixtral 8x7B, Llama 2, Mistral 7B, Orca 2, Phi-2, Neural 7B☆131Updated last year
- ☆86Updated last year
- A repository of Python scripts to scrape code contents of the public repositories of `huggingface`.☆53Updated last year
- a tiny vectorstore implementation built with numpy.☆63Updated last year
- Video+code lecture on building nanoGPT from scratch☆68Updated last year
- Experiments with open source LLMs☆74Updated 2 weeks ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning☆37Updated 7 months ago
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.☆196Updated last year
- A comprehensive deep dive into the world of tokens☆227Updated last year
- ☆61Updated last year
- A reimplementation of langgraph's customer support example in Rasa's CALM paradigm and a quantitative evaluation of the two approaches☆80Updated 9 months ago
- ☆88Updated 2 years ago
- Learn the building blocks of how to build gpt-oss from scratch☆110Updated 3 months ago