sandlogic / SandLogic-Lexicons
SandLogic Lexicons
☆17 · Updated 4 months ago
Alternatives and similar repositories for SandLogic-Lexicons:
Users interested in SandLogic-Lexicons are comparing it to the libraries listed below.
- Serving a PyTorch model asynchronously with Celery, Redis, and RabbitMQ ☆20 · Updated last year
- A Python wrapper around Hugging Face's TGI (text-generation-inference) and TEI (text-embedding-inference) servers ☆34 · Updated 2 months ago
- This repository shows various ways of deploying a vision model (TensorFlow) from 🤗 Transformers ☆29 · Updated 2 years ago
- Notebooks and scripts showcasing how to run quantized diffusion models on consumer GPUs ☆38 · Updated 3 months ago
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆38 · Updated 2 months ago
- MLflow end-to-end workshop at Chandigarh University ☆11 · Updated 2 years ago
- 🤝 Trade any tensors over the network ☆30 · Updated last year
- Repo hosting code and materials on speeding up LLM inference using token merging ☆35 · Updated 9 months ago
- Notes on quantization in neural networks ☆70 · Updated last year
- Quantization of LLMs and benchmarking ☆10 · Updated 10 months ago
- Complete implementation of Llama 2 with and without KV cache, plus inference 🚀 ☆47 · Updated 8 months ago
- Build agentic workflows with function calling using open LLMs ☆26 · Updated 2 weeks ago
- PyTorch at the Edge: Deploying Over 964 TIMM Models on Android with TorchScript and Flutter ☆44 · Updated last year
- Benchmarks of different devices I have come across ☆19 · Updated 2 months ago
- Article about deploying machine learning models using gRPC, PyTorch, and asyncio ☆27 · Updated 2 years ago
- Set of scripts to fine-tune LLMs ☆36 · Updated 10 months ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆102 · Updated 4 months ago
- Open-source projects from Pallas Lab ☆20 · Updated 3 years ago
- Code for the NeurIPS LLM Efficiency Challenge ☆55 · Updated 10 months ago
- Mixed-precision training from scratch with Tensors and CUDA ☆21 · Updated 9 months ago
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models ☆55 · Updated 4 months ago
- QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX ☆137 · Updated this week
- A general 2–8 bit quantization toolbox with GPTQ/AWQ/HQQ/VPTQ and easy export to ONNX/ONNX Runtime ☆159 · Updated last week
- Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU) ☆171 · Updated this week
- Repository for CPU Kernel Generation for LLM Inference ☆25 · Updated last year
- NVIDIA Riva runnable tutorials ☆123 · Updated 2 months ago
- Notebooks for fine-tuning PaliGemma ☆93 · Updated last month