alvarobartt/hf-mem

A CLI to estimate inference memory requirements for Hugging Face models, written in Python.

☆938

Alternatives and similar repositories for hf-mem

Users who are interested in hf-mem are comparing it to the libraries listed below.

LMCache / LMCache
LMCache: Supercharge Your LLM with the Fastest KV Cache Layer
☆10,938 · Updated this week
StarTrail-org / LEANN
[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on …
☆12,743 · Updated this week
jranaraki / vllm-tuner
An intelligent tuner for vLLM that automatically monitors GPU metrics, uses Bayesian optimization to tune parameters
☆66 · Mar 12, 2026 · Updated 4 months ago
0xSojalSec / airllm
Runs 405B LLMs on 8GB VRAM
☆3,043 · Apr 2, 2026 · Updated 3 months ago
AlexsJones / llmfit
Hundreds of models & providers. One command to find what runs on your hardware.
☆30,892 · Updated this week
transformerlab / transformerlab-app
The open source research environment for AI researchers to seamlessly train, evaluate, and scale models from local hardware to GPU cluste…
☆5,165 · Updated this week
unslothai / unsloth
Unsloth is a local UI for training and running Gemma 4, Qwen3.6, DeepSeek, Kimi, GLM and other models.
☆69,060 · Updated this week
huggingface / hf-mount
Mount Hugging Face Buckets and repos as local filesystems. No download, no copy, no waiting.
☆770 · Updated this week
ZHZisZZ / dllm
dLLM: Simple Diffusion Language Modeling
☆2,656 · Jul 17, 2026 · Updated last week
VectifyAI / PageIndex
📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG
☆34,793 · Updated this week
OpenPipe / ART
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement…
☆10,545 · Updated this week
confident-ai / deepeval
The LLM Evaluation Framework
☆17,260 · Updated this week
Mega4alik / ollm
☆2,700 · Updated this week
vllm-project / llm-compressor
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
☆3,602 · Updated this week
huggingface / kernels
Build compute kernels and load them from the Hub.
☆715 · Updated this week
vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆87,317 · Updated this week
NVIDIA-NeMo / DataDesigner
🎨 NeMo Data Designer: Generate high-quality synthetic data from scratch or from seed data.
☆2,132 · Updated this week
microsoft / agent-lightning
The absolute trainer to light up AI agents.
☆17,428 · Jul 16, 2026 · Updated 2 weeks ago
huggingface / smol-course
A course on aligning smol models.
☆6,708 · May 26, 2026 · Updated 2 months ago
huggingface / OpenEnv
An interface library for RL post training with environments.
☆2,457 · Updated this week
predibase / lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
☆3,819 · May 28, 2026 · Updated 2 months ago
huggingface / smol2operator
☆137 · Sep 23, 2025 · Updated 10 months ago
gabrielmbmb / candle-holder
A Rust crate offering similar functionality to the Python transformers package using Candle.
☆15 · Nov 19, 2024 · Updated last year
microsoft / BitNet
Official inference framework for 1-bit LLMs
☆39,789 · Updated this week
google / langextract
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive vi…
☆37,920 · Updated this week
huggingface / hf-sandbox
Modal-style sandbox API on top of Hugging Face Jobs
☆156 · Jul 6, 2026 · Updated 3 weeks ago
Roots-Automation / GutenOCR
Open-source tools for training and evaluating Vision Language Models for OCR
☆190 · Jun 25, 2026 · Updated last month
RyanCodrai / turbovec
A vector index built on TurboQuant, written in Rust with Python bindings
☆14,494 · Updated this week
studio-dots-ai / dots.ocr
Multilingual Document Layout Parsing in a Single Vision-Language Model
☆9,042 · Mar 24, 2026 · Updated 4 months ago
FalkorDB / FalkorDB
A super fast Graph Database uses GraphBLAS under the hood for its sparse adjacency matrix graph representation. Our goal is to provide th…
☆4,844 · Updated this week
memvid / memvid
Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval…
☆16,074 · Jul 14, 2026 · Updated 2 weeks ago
mlabonne / llm-datasets
Curated list of datasets and tools for post-training.
☆4,712 · Apr 29, 2026 · Updated 3 months ago
stephantul / pynife
Nearly Inference Free Embeddings: make your RAG queries 500x faster
☆80 · Apr 27, 2026 · Updated 3 months ago
Ananyaiitbhilai / Text2Triple-LLM-Agent
[ESWC '24] This repo is official implementation for the paper "Towards Harnessing Large Language Models as Autonomous Agents for Semantic…
☆10 · May 25, 2024 · Updated 2 years ago
StarlightSearch / EmbedAnything
Highly Performant, Modular, Memory Safe and Production-ready Inference, Ingestion and Indexing built in Rust 🦀
☆1,288 · Jul 15, 2026 · Updated 2 weeks ago
SakanaAI / text-to-lora
Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input
☆1,296 · Jun 8, 2025 · Updated last year
dottxt-ai / outlines
Structured Outputs
☆15,419 · Updated this week
stanfordnlp / dspy
DSPy: The framework for programming—not prompting—language models
☆36,460 · Updated this week
gradio-app / trackio
A lightweight, local-first, and free experiment tracking library from Hugging Face 🤗
☆1,611 · Updated this week