rickardp / bitsandbytes
8-bit CUDA functions for PyTorch
☆18Updated 6 months ago
Alternatives and similar repositories for bitsandbytes
Users who are interested in bitsandbytes are comparing it to the libraries listed below
- Experimental sampler to make LLMs more creative☆31Updated last year
- For inferring and serving local LLMs using the MLX framework☆103Updated last year
- Very basic framework for composable, parameterized large language model (Q)LoRA / (Q)DoRA fine-tuning using mlx, mlx_lm, and OgbujiPT.☆42Updated last week
- Extends the original llama.cpp repo to support the RedPajama model.☆118Updated 9 months ago
- Grammar checker with a keyboard shortcut for Ollama and Apple MLX with Automator on macOS.☆82Updated last year
- GPT-2 small trained on phi-like data☆66Updated last year
- ☆22Updated last year
- ☆35Updated 2 years ago
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto…☆43Updated last year
- ☆38Updated last year
- Command-line script for inferencing from models such as falcon-7b-instruct☆75Updated 2 years ago
- Scripts to create your own MoE models using MLX☆90Updated last year
- An unsupervised model merging algorithm for Transformers-based language models.☆105Updated last year
- A fast minimalistic implementation of guided generation on Apple Silicon using Outlines and MLX☆55Updated last year
- A guidance compatibility layer for llama-cpp-python☆35Updated last year
- Cog wrapper for collabora/WhisperSpeech☆25Updated last year
- Simple, fast, parallel Hugging Face GGML model downloader written in Python☆24Updated last year
- ☆27Updated last year
- Model REVOLVER, a human in the loop model mixing system.☆33Updated last year
- An implementation of Self-Extend, which expands the context window via grouped attention☆118Updated last year
- ☆40Updated 2 years ago
- ☆31Updated last year
- Easily convert HuggingFace models to GGUF-format for llama.cpp☆21Updated 11 months ago
- MLX implementations of various transformers, speedups, and training☆33Updated last year
- A clone of OpenAI's Tokenizer page for HuggingFace Models☆45Updated last year
- ☆73Updated last year
- ☆24Updated last year
- All the world is a play, we are but actors in it.☆50Updated this week
- Full finetuning of large language models without large memory requirements☆94Updated last year
- Simple LLM inference server☆20Updated last year