clabrugere / scratch-llm
Implements an LLM similar to Meta's Llama 2 from the ground up in PyTorch, for educational purposes.
☆37Updated 9 months ago
Alternatives and similar repositories for scratch-llm
Users that are interested in scratch-llm are comparing it to the libraries listed below
- Gemma2 (9B) and Llama3-8B fine-tuning and RAG, sample code base implemented on the Kaggle platform☆22Updated 9 months ago
- Manages vllm-nccl dependency☆17Updated last year
- Playground for Transformers☆53Updated last year
- Benchmarking PyTorch 2.0 different models☆20Updated 2 years ago
- Fast and memory-efficient exact attention ported to rocm☆11Updated last year
- ☆17Updated last year
- Tutorial for LLM developers about engine design, service deployment, evaluation/benchmark, etc. Provide a C/S style optimized LLM inferen…☆19Updated 2 years ago
- minimal scripts for 24GB VRAM GPUs. training, inference, whatever☆50Updated 2 weeks ago
- Implementation of transformer-based architectures in PyTorch.☆54Updated 4 years ago
- Multi-Layer Key-Value sharing experiments on Pythia models☆34Updated last year
- Create a source of truth for ML model results and browse it on Papers with Code☆33Updated 4 years ago
- Advanced implementation of DeepSeek-R1 featuring Group Relative Policy Optimization (GRPO) for mathematical reasoning AI. Integrates safe…☆13Updated 10 months ago
- several types of attention modules written in PyTorch for learning purposes☆52Updated last year
- Context Manager to profile the forward and backward times of PyTorch's nn.Module☆83Updated 2 years ago
- Experimental scripts for researching data adaptive learning rate scheduling.☆22Updated 2 years ago
- Fine-tuning an LLM using a Generic Workflow and Best Practices with PyTorch☆27Updated 2 years ago
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…☆57Updated this week
- Visual similarity search engine demo with use of PyTorch Metric Learning and Qdrant☆12Updated 2 years ago
- Library for the Test-based Calibration Error (TCE) metric to quantify the degree of classifier calibration.☆13Updated 2 years ago
- Microsoft Phi 2 Streamlit App, deployed on HuggingFace Spaces, is based on the Microsoft Phi 2 small language model (SLM) for text generat…☆14Updated last year
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)☆28Updated 2 years ago
- Make triton easier☆49Updated last year
- A tiny package supporting distributed computation of COCO metrics for PyTorch models.☆15Updated 2 years ago
- ☆31Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…☆31Updated 2 years ago
- code for paper "Accessing higher dimensions for unsupervised word translation"☆22Updated 2 years ago
- ☆78Updated last year
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training☆15Updated 11 months ago
- a curated list of the role of small models in the LLM era☆109Updated last year
- OLMost every training recipe you need to perform data interventions with the OLMo family of models.☆56Updated this week