prabhuomkar / bitbeast
Experiments with Model Training, Deployment & Monitoring
☆40 · Updated 6 months ago
Alternatives and similar repositories for bitbeast
Users that are interested in bitbeast are comparing it to the libraries listed below
- Python bindings for ggml ☆147 · Updated last year
- Curated list of awesome material on optimization techniques to make artificial intelligence faster and more efficient ☆119 · Updated 2 years ago
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆138 · Updated last year
- The Triton backend for the PyTorch TorchScript models. ☆173 · Updated this week
- The Triton backend for the ONNX Runtime. ☆173 · Updated this week
- Presents comprehensive benchmarks of XLA-compatible pre-trained models in Keras. ☆37 · Updated 2 years ago
- Port of Microsoft's BioGPT in C/C++ using ggml ☆86 · Updated last year
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆59 · Updated last month
- Various transformers for FSDP research ☆38 · Updated 3 years ago
- The backend behind the LLM-Perf Leaderboard ☆11 · Updated last year
- Google TPU optimizations for transformers models ☆134 · Updated 2 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 · Updated 2 months ago
- experiments with inference on llama ☆103 · Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆58 · Updated last year
- ClearML Fractional GPU - Run multiple containers on the same GPU with driver level memory limitation ✨ and compute time-slicing ☆88 · Updated 2 months ago
- Lightning HPO & Training Studio App ☆19 · Updated 2 years ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆217 · Updated last week
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs ☆110 · Updated 2 years ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆306 · Updated last year
- SGLang is a fast serving framework for large language models and vision language models. ☆32 · Updated 2 months ago
- Article about deploying machine learning models using grpc, pytorch and asyncio ☆30 · Updated 3 years ago
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆328 · Updated 4 months ago
- OpenAI compatible API for TensorRT LLM triton backend ☆220 · Updated last year
- Module, Model, and Tensor Serialization/Deserialization ☆287 · Updated this week
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 · Updated 4 months ago
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆92 · Updated 2 years ago
- Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inferen… ☆73 · Updated 2 weeks ago
- Plugin for deploying MLflow models to TorchServe ☆110 · Updated 2 years ago
- Inference of Mamba and Mamba2 models in pure C ☆196 · Updated 2 weeks ago
- ☆198 · Updated 2 years ago