prabhuomkar / bitbeast
Experiments with Model Training, Deployment & Monitoring
☆40 Updated last month
Alternatives and similar repositories for bitbeast
Users who are interested in bitbeast are comparing it to the libraries listed below.
- Python bindings for ggml ☆146 Updated last year
- A ⚡️ Lightning.ai ⚡️ app demo for Voice based web search using OpenAI's Whisper and DuckDuckGo ☆27 Updated 2 years ago
- Curated list of awesome material on optimization techniques to make artificial intelligence faster and more efficient 🚀 ☆119 Updated last year
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆138 Updated last year
- Various transformers for FSDP research ☆38 Updated 2 years ago
- The backend behind the LLM-Perf Leaderboard ☆10 Updated last year
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆59 Updated 2 months ago
- Article about deploying machine learning models using grpc, pytorch and asyncio ☆29 Updated 2 years ago
- experiments with inference on llama ☆104 Updated last year
- The Triton backend for the ONNX Runtime. ☆161 Updated 2 weeks ago
- The Triton backend for the PyTorch TorchScript models. ☆159 Updated 2 weeks ago
- Context Manager to profile the forward and backward times of PyTorch's nn.Module ☆83 Updated last year
- ML/DL Math and Method notes ☆63 Updated last year
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆92 Updated 2 years ago
- Port of Microsoft's BioGPT in C/C++ using ggml ☆85 Updated last year
- ☆14 Updated 3 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆266 Updated 11 months ago
- Google TPU optimizations for transformers models ☆120 Updated 8 months ago
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks. ☆64 Updated 8 months ago
- ClearML - Model-Serving Orchestration and Repository Solution ☆156 Updated last month
- Tune efficiently any LLM model from HuggingFace using distributed training (multiple GPU) and DeepSpeed. Uses Ray AIR to orchestrate the … ☆59 Updated 2 years ago
- 🚀 Stream inferences of real-time ML models in production to any data lake (Experimental) ☆81 Updated 3 years ago
- 🤝 Trade any tensors over the network ☆30 Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆89 Updated this week
- Plugin for deploying MLflow models to TorchServe ☆110 Updated 2 years ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆296 Updated last year
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access… ☆114 Updated last year
- Lightning HPO & Training Studio App ☆18 Updated 2 years ago
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆197 Updated last year
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆211 Updated 5 months ago