hkproj / bert-from-scratch
BERT explained from scratch
★12 · Updated last year
Alternatives and similar repositories for bert-from-scratch
Users interested in bert-from-scratch are comparing it to the repositories listed below.
- ★36 · Updated 2 weeks ago
- Complete implementation of Llama2 with/without KV cache & inference · ★46 · Updated last year
- Distributed training (multi-node) of a Transformer model · ★68 · Updated last year
- ★87 · Updated 8 months ago
- Prune transformer layers · ★69 · Updated last year
- Unofficial implementation of https://arxiv.org/pdf/2407.14679 · ★44 · Updated 9 months ago
- Notes on Direct Preference Optimization · ★19 · Updated last year
- A set of scripts and notebooks on LLM finetuning and dataset creation · ★111 · Updated 8 months ago
- Notes about the LLaMA 2 model · ★61 · Updated last year
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning · ★46 · Updated last year
- ★125 · Updated last year
- An extension of the nanoGPT repository for training small MoE models · ★147 · Updated 2 months ago
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… · ★67 · Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… · ★49 · Updated 10 months ago
- A curated list of the role of small models in the LLM era · ★100 · Updated 8 months ago
- Set of scripts to finetune LLMs · ★37 · Updated last year
- Triton implementation of GPT/LLaMA · ★18 · Updated 9 months ago
- Data preparation code for the Amber 7B LLM · ★91 · Updated last year
- ML/DL math and method notes · ★61 · Updated last year
- Mixed precision training from scratch with Tensors and CUDA · ★23 · Updated last year
- Notes on quantization in neural networks · ★83 · Updated last year
- Minimal GRPO implementation from scratch · ★90 · Updated 2 months ago
- Collection of resources for RL and reasoning · ★25 · Updated 4 months ago
- Notes and commented code for RLHF (PPO) · ★96 · Updated last year
- This repository contains the code for dataset curation and finetuning of the instruct variant of the bilingual OpenHathi model. The resultin… · ★23 · Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 Transformers and open-source datasets. · ★77 · Updated 7 months ago
- ★14 · Updated last month
- Making the official Triton tutorials actually comprehensible · ★36 · Updated 2 months ago
- ★92 · Updated 2 months ago
- I learn about and explain quantization · ★26 · Updated last year