veekaybee / what_are_embeddings
A deep dive into embeddings starting from fundamentals
☆1,057 · Jan 17, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for what_are_embeddings
Users who are interested in what_are_embeddings are comparing it to the libraries listed below.
- Good books, good vibes ☆431 · Jan 6, 2024 · Updated 2 years ago
- Machine Learning Engineering Open Book ☆16,675 · Updated this week
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆10,309 · Jul 1, 2024 · Updated last year
- DSPy: The framework for programming—not prompting—language models ☆32,156 · Updated this week
- Structured Outputs ☆13,403 · Feb 6, 2026 · Updated last week
- Toolkit to forge scikit-learn compatible estimators ☆19 · Feb 1, 2026 · Updated 2 weeks ago
- A guidance language for controlling large language models. ☆21,270 · Feb 6, 2026 · Updated last week
- just a bunch of useful embeddings for scikit-learn pipelines ☆521 · Sep 29, 2025 · Updated 4 months ago
- The balance python package offers a simple workflow and methods for dealing with biased data samples when looking to infer from them to s… ☆737 · Updated this week
- Go ahead and axolotl questions ☆11,289 · Updated this week
- structured outputs for llms ☆12,357 · Updated this week
- A tiny nearest-neighbor embedding database built with SQLite and Pytorch. (In development!) ☆772 · Jul 12, 2023 · Updated 2 years ago
- 🤖 A PyTorch library of curated Transformer models and their composable components ☆894 · Apr 17, 2024 · Updated last year
- The release of the Twitter algorithm, annotated for recsys ☆494 · Apr 15, 2023 · Updated 2 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆52,955 · Nov 12, 2025 · Updated 3 months ago
- LlamaIndex is the leading framework for building LLM-powered agents over your data. ☆46,977 · Updated this week
- Projects completed under LinuxWorld Informatics Ltd. - MLOps Training. ☆12 · Aug 15, 2020 · Updated 5 years ago
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,852 · May 17, 2025 · Updated 8 months ago
- Implement a ChatGPT-like LLM in PyTorch from scratch, step by step ☆85,210 · Updated this week
- Explanation to key concepts in ML ☆8,513 · Jun 30, 2025 · Updated 7 months ago
- Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. ☆74,834 · Feb 5, 2026 · Updated last week
- LLM101n: Let's build a Storyteller ☆36,281 · Aug 1, 2024 · Updated last year
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆13,155 · Feb 8, 2026 · Updated last week
- Lightning ⚡️ fast forecasting with statistical and econometric models. ☆4,687 · Updated this week
- A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with gi… ☆19,005 · Updated this week
- llama3 implementation one matrix multiplication at a time ☆15,239 · May 23, 2024 · Updated last year
- Hackers' Guide to Language Models ☆1,864 · Dec 13, 2024 · Updated last year
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆6,183 · Aug 22, 2025 · Updated 5 months ago
- A Bulletproof Way to Generate Structured JSON from Language Models ☆4,902 · Feb 24, 2024 · Updated last year
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,594 · Dec 20, 2025 · Updated last month
- Creating beautiful plots of data maps ☆975 · Updated this week
- Numbers every LLM developer should know ☆4,282 · Jan 16, 2024 · Updated 2 years ago
- Understanding Deep Learning - Simon J.D. Prince ☆9,071 · Updated this week
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,852 · Updated this week
- It's a cooler way to store simple linear models. ☆27 · Jul 15, 2024 · Updated last year
- Some microbenchmarks and design docs before commencement ☆12 · Feb 1, 2021 · Updated 5 years ago
- data cleaning and curation for unstructured text ☆328 · Aug 6, 2024 · Updated last year
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,885 · Updated this week
- A playbook for systematically maximizing the performance of deep learning models. ☆29,798 · Jun 18, 2024 · Updated last year