hamelsmu / llama-inference

Experiments with inference on Llama

☆103

Alternatives and similar repositories for llama-inference

Users who are interested in llama-inference compare it to the libraries listed below.

huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆273 Updated last year
premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆139 Updated last year
Preemo-Inc / text-generation-inference
☆197 Updated last year
jxmorris12 / bm25_pt
Minimal PyTorch implementation of BM25 (with sparse tensors)
☆104 Updated last year
daniel-furman / sft-demos
Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.
☆78 Updated last year
Upaya07 / NeurIPS-llm-efficiency-challenge
Code for NeurIPS LLM Efficiency Challenge
☆59 Updated last year
MoritzLaurer / zeroshot-classifier
Notebooks for training universal 0-shot classifiers on many different tasks
☆136 Updated 9 months ago
sabetAI / BLoRA
Batched LoRAs
☆346 Updated 2 years ago
huggingface / data-is-better-together
Let's build better datasets, together!
☆262 Updated 10 months ago
mixedbread-ai / batched
The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…
☆151 Updated 3 months ago
IlyasMoutawwakil / py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆33 Updated last month
ChrisHayduk / qlora-multi-gpu
QLoRA with Enhanced Multi GPU Support
☆37 Updated 2 years ago
abacaj / train-with-fsdp
☆94 Updated 2 years ago
AblateIt / finetune-study
Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes.
☆82 Updated 2 years ago
sileod / tasksource
Dataset collection and preprocessing framework for extreme multitask learning in NLP
☆188 Updated 3 months ago
FastEval / FastEval
Fast & more realistic evaluation of chat language models. Includes leaderboard.
☆189 Updated last year
mkuchnik / relm
ReLM is a Regular Expression engine for Language Models
☆106 Updated 2 years ago
imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆201 Updated last year
tcapelle / llm_recipes
A set of scripts and notebooks on LLM finetuning and dataset creation
☆110 Updated last year
jondurbin / bagel
A bagel, with everything.
☆324 Updated last year
dust-tt / llama-ssp
Experiments on speculative sampling with Llama models
☆125 Updated 2 years ago
leehanchung / lora-instruct
Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA
☆103 Updated 5 months ago
argilla-io / notus
Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app…
☆169 Updated last year
rmihaylov / mpttune
Tune MPTs
☆84 Updated 2 years ago
Locutusque / TPU-Alignment
Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free
☆231 Updated 11 months ago
davanstrien / haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
☆68 Updated last year
Muhtasham / summarization-eval
📝 Reference-Free automatic summarization evaluation with potential hallucination detection
☆102 Updated last year
cohere-ai / DiskVectorIndex
☆210 Updated 3 months ago
IBM / fastfit
FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes
☆212 Updated last month
rwightman / genalog
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te…
☆43 Updated last year