okuvshynov / cubestat
Horizon chart for CPU/GPU/Neural Engine utilization monitoring on Apple M1/M2 and NVIDIA GPUs on Linux
☆25 · Updated last week
Alternatives and similar repositories for cubestat:
Users interested in cubestat are comparing it to the libraries listed below.
- A super simple web interface to perform blind tests on LLM outputs. ☆28 · Updated last year
- First-token cutoff sampling inference example ☆29 · Updated last year
- Finetune your embeddings in-browser ☆32 · Updated last year
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs ☆78 · Updated 9 months ago
- The Prime Intellect CLI provides a powerful command-line interface for managing GPU resources across various providers ☆18 · Updated 2 weeks ago
- A text-to-SQL prototype on the Northwind SQLite dataset ☆12 · Updated 6 months ago
- llm plugin for the Cerebras fast inference API ☆24 · Updated last month
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆132 · Updated last week
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆17 · Updated last year
- ☆15 · Updated last year
- Resources regarding evML (edge-verified machine learning) ☆16 · Updated 3 months ago
- The DPAB-α Benchmark ☆20 · Updated 3 months ago
- ☆53 · Updated 11 months ago
- Training hybrid models for dummies. ☆20 · Updated 3 months ago
- Concatenated documentation for use with LLMs ☆29 · Updated this week
- A library for incremental loading of large PyTorch checkpoints ☆56 · Updated 2 years ago
- 🛠 Self-hosted, fast, and consistent remote configuration for apps. ☆15 · Updated 2 years ago
- Exploration of vector database indexes for fast approximate nearest-neighbour search. ☆22 · Updated 8 months ago
- A command-line utility to manage MLX models between your Hugging Face cache and LM Studio. ☆33 · Updated 2 months ago
- A Python command-line tool to download and manage MLX AI models from Hugging Face. ☆17 · Updated 7 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX. ☆76 · Updated 4 months ago
- Run Llama 2 using MLX on macOS ☆34 · Updated last year
- A fork of llama3.c used for R&D on inference ☆20 · Updated 4 months ago
- Transformer GPU VRAM estimator ☆59 · Updated last year
- Lightweight Llama 3 8B inference engine in CUDA C ☆47 · Updated last month
- Some tough questions to test new models. ☆27 · Updated last year
- An implementation of Nougat that focuses on processing PDFs locally. ☆81 · Updated 3 months ago
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆28 · Updated 3 months ago
- Command-line tool for the Deep Infra cloud ML inference service ☆30 · Updated 10 months ago
- A simple GitHub Actions script to build a llamafile and upload it to Hugging Face ☆14 · Updated last year