sandlogic / SandLogic-Lexicons

SandLogic Lexicons

☆19

Alternatives and similar repositories for SandLogic-Lexicons

Users who are interested in SandLogic-Lexicons are comparing it to the libraries listed below.

rasbt / pytorch-memory-optim
This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po…
☆92 Updated 2 years ago
ariG23498 / quantized-diffusion-inference
Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs
☆38 Updated 11 months ago
premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆139 Updated last year
cray-lm / cray-lm
Cray-LM unified training and inference stack.
☆22 Updated 8 months ago
neuralmagic / nm-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆266 Updated last year
EmbeddedLLM / vllm
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
☆90 Updated last week
deshwalmahesh / PHUDGE
Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…
☆50 Updated last year
joey00072 / ohara
Collection of autoregressive model implementations
☆86 Updated 5 months ago
anyscale / e2e-llm-workflows
Fine-tune an LLM to perform batch inference and online serving.
☆112 Updated 4 months ago
Pleias / Various-Finetuning
Set of scripts to finetune LLMs
☆38 Updated last year
chainyo / tensorshare
🤝 Trade any tensors over the network
☆30 Updated 2 years ago
ariG23498 / gemma3-object-detection
Fine tune Gemma 3 on an object detection task
☆86 Updated 3 months ago
GreenBitAI / low_bit_llama
Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs
☆110 Updated last year
Macaronlin / LLaMA3-Quantization
A repository dedicated to evaluating the performance of quantized LLaMA3 using various quantization methods.
☆195 Updated 9 months ago
Upaya07 / NeurIPS-llm-efficiency-challenge
Code for NeurIPS LLM Efficiency Challenge
☆59 Updated last year
IST-DASLab / Quartet
☆102 Updated this week
HabanaAI / Gaudi-solutions
Full End-to-End examples showing how to use First-gen Gaudi and Gaudi2 in common use cases
☆12 Updated 10 months ago
hkproj / quantization-notes
Notes on quantization in neural networks
☆104 Updated last year
ThinamXx / Meta-llama
Complete implementation of Llama2 with/without KV cache & inference 🚀
☆48 Updated last year
intel / neural-speed
An innovative library for efficient LLM inference via low-bit quantization
☆349 Updated last year
samchaineau / llm_slerp_generation
Repo hosting code and materials related to speeding up LLM inference using token merging.
☆36 Updated last week
adithya-s-k / YoloGemma
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…
☆84 Updated last year
aniketmaurya / discord-llm-bot
Fun project: LLM powered RAG Discord Bot that works seamlessly on CPU
☆31 Updated last year
hahnyuan / PB-LLM
PB-LLM: Partially Binarized Large Language Models
☆156 Updated last year
aniketmaurya / fastserve-ai
Machine Learning Serving focused on GenAI with simplicity as the top priority.
☆58 Updated last week
IlyasMoutawwakil / py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆33 Updated last month
wejoncy / QLLM
A general 2-8 bits quantization toolbox with GPTQ/AWQ/HQQ/VPTQ, and export to onnx/onnx-runtime easily.
☆180 Updated 6 months ago
huggingface / kernel-builder
👷 Build compute kernels
☆158 Updated this week
astramind-ai / BitMat
An efficient implementation of the method proposed in "The Era of 1-bit LLMs"
☆154 Updated last year
stas00 / ml-ways
ML/DL Math and Method notes
☆64 Updated last year