BobaZooba / xllm
🦖 X—LLM: Cutting Edge & Easy LLM Finetuning
★371, updated 8 months ago
Related projects:
- ★429, updated 8 months ago
- LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processin… ★667, updated this week
- Toolkit for attaching, training, saving and loading of new heads for transformer models ★237, updated last week
- Fine-Tuning Embedding for RAG with Synthetic Data ★456, updated last year
- Automatically evaluate your LLMs in Google Colab ★511, updated 4 months ago
- A bagel, with everything. ★306, updated 5 months ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ★217, updated 6 months ago
- Tune any FALCON in 4-bit ★469, updated last year
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining ★657, updated 5 months ago
- Best practices for distilling large language models. ★371, updated 7 months ago
- ★201, updated 7 months ago
- The repository for the code of the UltraFastBERT paper ★508, updated 5 months ago
- Domain Adapted Language Modeling Toolkit - E2E RAG ★295, updated 3 months ago
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ★195, updated 4 months ago
- An Open Source Toolkit For LLM Distillation ★284, updated last month
- LLM Workshop by Sourab Mangrulkar ★322, updated 3 months ago
- Generate textbook-quality synthetic LLM pretraining data ★479, updated 11 months ago
- ★419, updated 2 months ago
- awesome synthetic (text) datasets ★213, updated last week
- Let's build better datasets, together! ★195, updated last month
- Fast & more realistic evaluation of chat language models. Includes leaderboard. ★180, updated 8 months ago
- Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript ★545, updated 2 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ★217, updated 2 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ★1,408, updated this week
- Fine-tune mistral-7B on 3090s, a100s, h100s ★701, updated 11 months ago
- ★182, updated 7 months ago
- ★276, updated 3 weeks ago
- Easily embed, cluster and semantically label text datasets ★434, updated 5 months ago
- Fast lexical search library implementing BM25 in Python using Numpy and Scipy ★770, updated this week
- A comprehensive deep dive into the world of tokens ★212, updated 2 months ago