youkaichao / fast_bpe_tokenizer
A fast BPE tokenizer: simple to understand, easy to use
☆25 · Updated last year
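As context for what the repository implements: byte-pair encoding builds a vocabulary by repeatedly merging the most frequent adjacent symbol pair in a word-frequency table. The sketch below is not code from fast_bpe_tokenizer; it is a minimal, generic illustration of that training loop (all names are illustrative).

```python
from collections import Counter

def get_pair_counts(words):
    """Count adjacent symbol pairs over a {symbol-tuple: frequency} table."""
    pairs = Counter()
    for word, freq in words.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with its concatenation."""
    a, b = pair
    merged = a + b
    new_words = {}
    for word, freq in words.items():
        out, i = [], 0
        while i < len(word):
            if i < len(word) - 1 and word[i] == a and word[i + 1] == b:
                out.append(merged)
                i += 2
            else:
                out.append(word[i])
                i += 1
        key = tuple(out)
        new_words[key] = new_words.get(key, 0) + freq
    return new_words

def train_bpe(corpus, num_merges):
    """Learn an ordered list of merges from a whitespace-split corpus."""
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(words)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        words = merge_pair(words, best)
    return merges

# Tiny demo corpus: 'l'+'o' is merged first, then 'lo'+'w', then 'low'+'e'.
merges = train_bpe("low low low lower lowest", num_merges=3)
```

A real implementation such as the one in this repository would add byte-level pre-tokenization and a fast merge data structure; the loop above only shows the core idea.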
Alternatives and similar repositories for fast_bpe_tokenizer:
Users interested in fast_bpe_tokenizer are comparing it to the libraries listed below.
- Evaluating LLMs with Dynamic Data ☆77 · Updated last month
- RWKV-7: Surpassing GPT ☆80 · Updated 3 months ago
- An Experiment on Dynamic NTK Scaling RoPE ☆62 · Updated last year
- Low-bit optimizers for PyTorch ☆125 · Updated last year
- Longitudinal Evaluation of LLMs via Data Compression ☆32 · Updated 9 months ago
- Continuous batching and parallel acceleration for RWKV6 ☆24 · Updated 8 months ago
- 32 times longer context window than vanilla Transformers and up to 4 times longer than memory-efficient Transformers ☆46 · Updated last year
- Code for the paper "Patch-Level Training for Large Language Models" ☆81 · Updated 3 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆85 · Updated 5 months ago
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆76 · Updated last year
- SparseGPT + GPTQ compression of LLMs such as LLaMA, OPT, and Pythia ☆41 · Updated last year
- Implementation of the paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆53 · Updated last month
- ☆46 · Updated last year
- Large-scale distributed model training strategy with Colossal AI and Lightning AI ☆57 · Updated last year
- ☆30 · Updated 9 months ago
- [ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear at… ☆99 · Updated 8 months ago
- Linear Attention Sequence Parallelism (LASP) ☆79 · Updated 9 months ago
- Experiments on speculative sampling with Llama models ☆125 · Updated last year
- ☆34 · Updated 7 months ago
- Structural Pruning for LLaMA ☆54 · Updated last year
- RWKV in nanoGPT style ☆187 · Updated 9 months ago
- ☆28 · Updated 11 months ago
- GPT-2 implementation in C++ using Ort ☆26 · Updated 4 years ago
- ☆42 · Updated last year
- Odysseus: Playground of LLM Sequence Parallelism ☆66 · Updated 8 months ago
- BigKnow2022: Bringing Language Models Up to Speed ☆14 · Updated last year
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024) ☆204 · Updated 9 months ago
- QuIP quantization ☆51 · Updated 11 months ago
- ☆115 · Updated 3 weeks ago