marta1994 / efficient_bpe_explanation
This repository provides a clear, educational implementation of Byte Pair Encoding (BPE) tokenization in plain Python. The focus is on algorithmic understanding, not raw performance.
☆10 Updated 11 months ago
Alternatives and similar repositories for efficient_bpe_explanation
Users who are interested in efficient_bpe_explanation are comparing it to the repositories listed below.
- Prune transformer layers ☆69 Updated last year
- Best practices for distilling large language models. ☆569 Updated last year
- LLM Workshop by Sourab Mangrulkar ☆391 Updated last year
- Toolkit for attaching, training, saving and loading of new heads for transformer models ☆284 Updated 5 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆256 Updated last year
- Llama from scratch, or How to implement a paper without crying ☆574 Updated last year
- A curated list of papers related to constrained decoding of LLMs, along with their relevant code and resources. ☆243 Updated 3 weeks ago
- Best practices & guides on how to write distributed pytorch training code ☆464 Updated 5 months ago
- The repository for the code of the UltraFastBERT paper ☆517 Updated last year
- A set of scripts and notebooks on LLM finetuning and dataset creation ☆110 Updated 10 months ago
- 🦖 X—LLM: Cutting Edge & Easy LLM Finetuning ☆403 Updated last year
- Chat Templates for 🤗 HuggingFace Large Language Models ☆691 Updated 8 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆326 Updated 3 months ago
- Interpretability for sequence generation models 🐛 🔍 ☆432 Updated 3 months ago
- This repository contains the code for dataset curation and finetuning of instruct variant of the Bilingual OpenHathi model. The resultin… ☆23 Updated last year
- List of papers on hallucination detection in LLMs. ☆931 Updated last month
- BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach. ☆209 Updated 3 months ago
- batched loras ☆345 Updated last year
- ☆43 Updated 2 months ago
- A collection of LogitsProcessors to customize and enhance LLM behavior for specific tasks. ☆337 Updated last month
- A compact LLM pretrained in 9 days by using high quality data ☆320 Updated 4 months ago
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆1,007 Updated 11 months ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆811 Updated 2 weeks ago
- awesome synthetic (text) datasets ☆291 Updated last month
- This repository collects all relevant resources about interpretability in LLMs ☆368 Updated 9 months ago
- ☆15 Updated 6 months ago
- The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction ☆388 Updated last year
- Tutorial for how to build BERT from scratch ☆97 Updated last year
- Stanford NLP Python library for understanding and improving PyTorch models via interventions ☆786 Updated last week
- 🧠 A study guide to learn about Transformers ☆11 Updated last year