pbelcak / UltraFastBERTLinks

The repository for the code of the UltraFastBERT paper

☆518

Alternatives and similar repositories for UltraFastBERT

Users that are interested in UltraFastBERT are comparing it to the libraries listed below

Sorting:

alasdairforsythe / tokenmonster
Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript
☆605Updated last year
pbelcak / fastfeedforward
A repository for log-time feedforward networks
☆222Updated last year
tomaarsen / attention_sinks
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
☆729Updated last year
HazyResearch / m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
☆560Updated 10 months ago
jondurbin / bagel
A bagel, with everything.
☆324Updated last year
SkunkworksAI / hydra-moe
☆414Updated 2 years ago
sabetAI / BLoRA
batched loras
☆347Updated 2 years ago
persimmon-ai-labs / adept-inference
Inference code for Persimmon-8B
☆412Updated 2 years ago
astramind-ai / BitMat
An efficent implementation of the method proposed in "The Era of 1-bit LLMs"
☆154Updated last year
IST-DASLab / qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
☆277Updated 2 years ago
Locutusque / TPU-Alignment
Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free
☆231Updated last year
HazyResearch / H3
Language Modeling with the H3 State Space Model
☆518Updated 2 years ago
Cornell-RelaxML / QuIP
Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"
☆385Updated last year
imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆201Updated last year
valine / NeuralFlow
Visualize the intermediate output of Mistral 7B
☆378Updated 9 months ago
apoorvumang / prompt-lookup-decoding
☆575Updated last year
center-for-humans-and-machines / transformer-heads
Toolkit for attaching, training, saving and loading of new heads for transformer models
☆290Updated 8 months ago
tysam-code / hlb-gpt
Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi…
☆351Updated last year
rmihaylov / falcontune
Tune any FALCON in 4-bit
☆464Updated 2 years ago
bobazooba / xllm
🦖 X—LLM: Cutting Edge & Easy LLM Finetuning
☆407Updated last year
Guitaricet / relora
Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
☆467Updated last year
abacaj / train-with-fsdp
☆94Updated 2 years ago
Cornell-RelaxML / quip-sharp
☆565Updated last year
huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆276Updated last year
kyegomez / Sophia
Effortless plugin and play Optimizer to cut model training costs by 50%. New optimizer that is 2x faster than Adam on LLMs.
☆383Updated last year
QuixiAI / laserRMT
This is our own implementation of 'Layer Selective Rank Reduction'
☆239Updated last year
Vahe1994 / SpQR
☆548Updated 11 months ago
dropbox / hqq
Official implementation of Half-Quadratic Quantization (HQQ)
☆891Updated 3 weeks ago
SqueezeAILab / SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
☆705Updated last year
yuhuixu1993 / qa-lora
Official PyTorch implementation of QA-LoRA
☆143Updated last year