huggingface / api-inference-community
☆161Updated 2 weeks ago
Related projects:
- ☆201Updated 7 months ago
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app…☆161Updated 8 months ago
- Hugging Face's Zapier Integration 🤗⚡️☆47Updated last year
- The package used to build the documentation of our Hugging Face repos☆82Updated this week
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆64Updated 2 months ago
- Use OpenAI with HuggingChat by emulating the text_generation_inference_server☆45Updated last year
- Manage histories of LLM-based applications☆86Updated 10 months ago
- Some simple scripts that I use day-to-day when working with LLMs and Hugging Face Hub☆154Updated 11 months ago
- HuggingChat-like UI in Gradio☆63Updated last year
- ☆83Updated last year
- Merge Transformers language models by use of gradient parameters.☆193Updated last month
- Exploring finetuning public checkpoints on filtered 8K sequences on Pile☆115Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs☆74Updated 5 months ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free☆217Updated 6 months ago
- Comprehensive analysis of the difference in performance of QLoRA, LoRA, and full finetunes.☆81Updated last year
- Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA☆99Updated last month
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ☆96Updated last year
- An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast☆131Updated 2 weeks ago
- Manage scalable open LLM inference endpoints in Slurm clusters☆217Updated 2 months ago
- experiments with inference on llama☆106Updated 3 months ago
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes…☆139Updated 11 months ago
- Google TPU optimizations for transformers models☆62Updated this week
- The Next Generation Multi-Modality Superintelligence☆69Updated 2 weeks ago
- Python bindings for ggml☆125Updated 2 weeks ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA☆123Updated last year
- Reimplementation of the task generation part from the Alpaca paper☆118Updated last year
- Experiments with generating open-source language model assistants☆97Updated last year
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.☆37Updated 10 months ago
- [WIP] A 🔥 interface for running code in the cloud☆86Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction'☆229Updated 3 months ago