Inference code for mixtral-8x7b-32kseqlen
☆104Dec 12, 2023Updated 2 years ago
Alternatives and similar repositories for mixtral-inference
Users who are interested in mixtral-inference are comparing it to the libraries listed below.
- Eh, simple and works.☆27Dec 9, 2023Updated 2 years ago
- [WIP] Transformer to embed Danbooru labelsets☆13Mar 31, 2024Updated 2 years ago
- Inference code for Mistral and Mixtral hacked up into original Llama implementation☆368Dec 9, 2023Updated 2 years ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Jul 12, 2023Updated 2 years ago
- A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI☆774Dec 15, 2023Updated 2 years ago
- Cerule - A Tiny Mighty Vision Model☆70Nov 9, 2025Updated 6 months ago
- ☆17Dec 5, 2023Updated 2 years ago
- ☆13Feb 28, 2024Updated 2 years ago
- ☆13Sep 17, 2022Updated 3 years ago
- ☆13Dec 22, 2023Updated 2 years ago
- ☆23Oct 19, 2024Updated last year
- This is the official repository for "LatentMan: Generating Consistent Animated Characters using Image Diffusion Models" [CVPRW 2024]☆22Jul 21, 2024Updated last year
- Ongoing research training transformer models at scale☆37Jan 19, 2024Updated 2 years ago
- ☆868Dec 8, 2023Updated 2 years ago
- Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit☆63Jun 21, 2023Updated 2 years ago
- ☆10Jan 16, 2024Updated 2 years ago
- ☆25Dec 21, 2023Updated 2 years ago
- Make AES-GCM safe to use with random nonces, for any practical number of messages.☆19Sep 16, 2025Updated 8 months ago
- Simple embedding -> text model trained on a small subset of Wikipedia sentences.☆158Aug 5, 2023Updated 2 years ago
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings.☆63Jul 8, 2024Updated last year
- Full finetuning of large language models without large memory requirements☆94Sep 22, 2025Updated 8 months ago
- ☆134Nov 24, 2023Updated 2 years ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models☆70Aug 27, 2023Updated 2 years ago
- ☆16Mar 2, 2024Updated 2 years ago
- Experimental method to use reference video to drive motion in generations without training in ComfyUI.☆37Apr 9, 2024Updated 2 years ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes.☆83Sep 10, 2023Updated 2 years ago
- ☆27Aug 25, 2023Updated 2 years ago
- ☆24May 5, 2024Updated 2 years ago
- ☆16Apr 23, 2024Updated 2 years ago
- Code from the paper Reflection for the Masses by Charlotte Herzeel, Pascal Costanza, and Theo D'Hondt.☆15Jun 21, 2021Updated 4 years ago
- ☆12Mar 25, 2024Updated 2 years ago
- React Frontend for stable diffusion☆27Sep 23, 2022Updated 3 years ago
- ☆16Apr 7, 2024Updated 2 years ago
- ☆24Dec 10, 2023Updated 2 years ago
- Huggingface Backup - Jupyter, Colab and Python Script☆10Jan 20, 2026Updated 4 months ago
- Retro styled terminal shell☆26May 8, 2024Updated 2 years ago
- High-performance GEMM implementation optimized for NVIDIA H100 GPUs, leveraging Hopper architecture's TMA, WGMMA, and Thread Block Cluste…☆10Dec 4, 2024Updated last year
- Generate textbook-quality synthetic LLM pretraining data☆508Oct 19, 2023Updated 2 years ago
- Port of Facebook's LLaMA model in C/C++☆21Nov 6, 2023Updated 2 years ago