zjohn77 / lightning-mlflow-hf
Use QLoRA to tune LLM in PyTorch-Lightning w/ Huggingface + MLflow
☆64 · Updated 2 years ago
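For orientation, here is a minimal sketch of the kind of pipeline the repo describes: a base model quantized to 4-bit, LoRA adapters trained in a LightningModule, and metrics routed to MLflow via `MLFlowLogger`. The model name, hyperparameters, and logger settings below are illustrative assumptions, not taken from the repo, and you must supply your own tokenized DataLoader.

```python
# Hedged sketch of QLoRA fine-tuning with PyTorch Lightning + MLflow.
# Assumptions: facebook/opt-350m as base model; peft, bitsandbytes,
# lightning, transformers, and mlflow installed; the actual repo may differ.
import lightning as L
import torch
from lightning.pytorch.loggers import MLFlowLogger
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig


class QLoRAModule(L.LightningModule):
    def __init__(self, model_name: str = "facebook/opt-350m"):
        super().__init__()
        # Load the base model in 4-bit (the "Q" in QLoRA)
        bnb_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
        model = AutoModelForCausalLM.from_pretrained(
            model_name, quantization_config=bnb_config
        )
        model = prepare_model_for_kbit_training(model)
        # Attach low-rank adapters; only these parameters are trained
        lora_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")
        self.model = get_peft_model(model, lora_config)

    def training_step(self, batch, batch_idx):
        # batch is expected to contain input_ids, attention_mask, labels
        out = self.model(**batch)
        self.log("train_loss", out.loss)  # forwarded to MLflow by the logger
        return out.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(
            (p for p in self.model.parameters() if p.requires_grad), lr=2e-4
        )


# Log params and metrics to a local MLflow store (illustrative settings)
mlf_logger = MLFlowLogger(experiment_name="qlora-finetune", tracking_uri="file:./mlruns")
trainer = L.Trainer(max_steps=100, precision="bf16-mixed", logger=mlf_logger)
# trainer.fit(QLoRAModule(), train_dataloaders=...)  # supply your own DataLoader
```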
Alternatives and similar repositories for lightning-mlflow-hf
Users interested in lightning-mlflow-hf are comparing it to the libraries listed below.
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆78 · Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 · Updated 3 months ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆88 · Updated last month
- Code for NeurIPS LLM Efficiency Challenge ☆59 · Updated last year
- Minimal PyTorch implementation of BM25 (with sparse tensors) ☆104 · Updated 2 months ago
- Repository containing awesome resources regarding Hugging Face tooling. ☆48 · Updated last year
- Experiments with inference on Llama ☆103 · Updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆68 · Updated last month
- 🎨 Imagine what Picasso could have done with AI. Self-host your StableDiffusion API. ☆50 · Updated 2 years ago
- 🤝 Trade any tensors over the network ☆30 · Updated 2 years ago
- ☆90 · Updated 5 months ago
- ☆124 · Updated last year
- ☆53 · Updated 10 months ago
- Notebooks for training universal 0-shot classifiers on many different tasks ☆137 · Updated last year
- Datamodels for Hugging Face tokenizers ☆86 · Updated 3 weeks ago
- Let's build better datasets, together! ☆267 · Updated last year
- Pre-train Static Word Embeddings ☆94 · Updated 3 months ago
- QLoRA with Enhanced Multi GPU Support ☆37 · Updated 2 years ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆71 · Updated last year
- A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers. ☆59 · Updated 5 months ago
- Command Line Interface for Hugging Face Inference Endpoints ☆66 · Updated last year
- Truly flash implementation of the DeBERTa disentangled attention mechanism. ☆67 · Updated 3 months ago
- Truly flash T5 implementation! ☆71 · Updated last year
- Google TPU optimizations for transformers models ☆131 · Updated last week
- ☆138 · Updated 4 months ago
- PyLate efficient inference engine ☆68 · Updated 3 months ago
- [WIP] A 🔥 interface for running code in the cloud ☆86 · Updated 2 years ago
- Code for Zero-Shot Tokenizer Transfer ☆142 · Updated 11 months ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆112 · Updated last month
- Supercharge huggingface transformers with model parallelism. ☆77 · Updated 5 months ago