hamelsmu / llama-inference
experiments with inference on llama
☆104 Updated 9 months ago
Alternatives and similar repositories for llama-inference:
Users who are interested in llama-inference are comparing it to the libraries listed below:
- Manage scalable open LLM inference endpoints in Slurm clusters ☆253 Updated 8 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆73 Updated 4 months ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆82 Updated last year
- ☆199 Updated last year
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆136 Updated 7 months ago
- minimal pytorch implementation of bm25 (with sparse tensors) ☆97 Updated last year
- Fast & more realistic evaluation of chat language models. Includes leaderboard. ☆185 Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆186 Updated 7 months ago
- ☆92 Updated last year
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆37 Updated 11 months ago
- A bagel, with everything. ☆317 Updated 11 months ago
- ☆113 Updated 5 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆189 Updated 5 months ago
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ☆125 Updated 3 months ago
- ReLM is a Regular Expression engine for Language Models ☆103 Updated last year
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆198 Updated 10 months ago
- Experiments with generating opensource language model assistants ☆97 Updated last year
- Code repository for the c-BTM paper ☆106 Updated last year
- Tune MPTs ☆84 Updated last year
- Exploring finetuning public checkpoints on filter 8K sequences on Pile ☆115 Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆230 Updated 4 months ago
- Datasets collection and preprocessings framework for NLP extreme multitask learning ☆175 Updated 2 months ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆147 Updated 10 months ago
- Set of scripts to finetune LLMs ☆36 Updated 11 months ago
- Evaluating LLMs with CommonGen-Lite ☆89 Updated 11 months ago
- Small and Efficient Mathematical Reasoning LLMs ☆71 Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆77 Updated 11 months ago
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆212 Updated last week
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆205 Updated 9 months ago
- Google TPU optimizations for transformers models ☆103 Updated last month