1CatAI/1Cat-vLLM

vLLM fork for Tesla V100 (SM70) with AWQ 4-bit support, CUDA 12.8 build flow, and validated Qwen3.5 27B/35B deployment on multi-GPU V100.

☆312

Alternatives and similar repositories for 1Cat-vLLM

Users that are interested in 1Cat-vLLM are comparing it to the libraries listed below.

QianboZang / KG-HTC
ECAI 2025
☆20 · May 4, 2026 · Updated 3 weeks ago
smpanaro / apple-silicon-4bit-quant
Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"
☆11 · Mar 31, 2024 · Updated 2 years ago
gpengzhi / CrossConST-MT
Code for Findings of ACL 2023 paper "Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency …
☆10 · Jul 18, 2023 · Updated 2 years ago
Neutrino1998 / search_with_langchain
Building a quick conversation-based search demo with langchain.
☆10 · Apr 2, 2024 · Updated 2 years ago
Rice-Field / NABirds
Modification of a dataset for bird body and face detection
☆15 · Jul 13, 2018 · Updated 7 years ago
jiaohuix / nmt_data_tools
Machine translation data processing tools
☆10 · Apr 29, 2024 · Updated 2 years ago
RRanddom / tf-bilinear-cnn
☆17 · Mar 6, 2019 · Updated 7 years ago
Phoenix1327 / ML-ZSL
☆11 · Nov 12, 2018 · Updated 7 years ago
janunger / aqbanking-php
A wrapper to use AqBanking CLI from a PHP context
☆13 · Dec 5, 2018 · Updated 7 years ago
linjieccc / PaddleNLP
Easy-to-use and Fast NLP library with awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications.
☆12 · Mar 13, 2024 · Updated 2 years ago
Qwen-Applications / STAR
STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models
☆48 · Apr 23, 2026 · Updated last month
lensesio / redux-lenses-streaming-example
☆11 · Jul 8, 2023 · Updated 2 years ago
Vokturz / fast-embeddings-api
fast-embeddings-api
☆16 · Nov 23, 2023 · Updated 2 years ago
eunomia-bpf / chatrepo
A GitHub App to chat with your GitHub repo's issues using ChatGPT
☆16 · Mar 8, 2023 · Updated 3 years ago
jiaohuix / PaddleSeq
PaddleSeq
☆10 · Mar 28, 2023 · Updated 3 years ago
kingsamchen / ezio
A tiny and efficient non-blocking or asynchronous network library
☆13 · May 4, 2019 · Updated 7 years ago
zju-vipa / EVS-Net
☆12 · May 20, 2022 · Updated 4 years ago
kyegomez / SoundStream
Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"
☆13 · Jan 27, 2025 · Updated last year
kenhktsui / anyclassifier
One Line To Build Zero-Data Classifiers in Minutes
☆65 · Sep 25, 2024 · Updated last year
Connum / npm-pinyin2ipa
Converts Mandarin Chinese pinyin notation to IPA (International Phonetic Alphabet) notation
☆19 · Nov 28, 2023 · Updated 2 years ago
chaoshen999 / llm-tools
Large language model toolkit
☆28 · Aug 1, 2025 · Updated 9 months ago
gpengzhi / Bi-SimCut
Code for NAACL 2022 main conference paper "Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation"
☆12 · May 8, 2023 · Updated 3 years ago
digitaldex / piMeter_hardware
Eagle Files for piMeter EnergyMonitor
☆13 · Sep 3, 2018 · Updated 7 years ago
vanquish630 / BaldGAN
Make any person bald!! Component of the paper: Learning to regulate 3D head shape by removing occluding hair from in-the-wild images.
☆12 · Jun 6, 2022 · Updated 3 years ago
kyegomez / CogNetX
CogNetX is an advanced, multimodal neural network architecture inspired by human cognition. It integrates speech, vision, and video proce…
☆21 · May 12, 2026 · Updated 2 weeks ago
junhahyung / MagiCapture
☆11 · Feb 26, 2024 · Updated 2 years ago
KsanaDock / KsanaChatBot
ChatBot is an AI chatbot service based on Spring Boot, providing functions such as AI dialogue, user profile management, and the Big Five…
☆30 · Apr 9, 2026 · Updated last month
laelhalawani / gguf_modeldb
A quick and optimized solution to manage llama-based gguf quantized models, download gguf files, retrieve message formatting, add more mo…
☆13 · Jan 13, 2024 · Updated 2 years ago
ahmdtaha / FineGrainedVisualRecognition
Fine-grained visual recognition TensorFlow baseline on CUB, Stanford Cars, Dogs, Aircrafts, and Flower102.
☆24 · Oct 9, 2021 · Updated 4 years ago
lucataco / cog-ip_adapter-sdxl-face
Attempt at cog wrapper for IP_Adapter-face for SDXL
☆15 · Nov 25, 2024 · Updated last year
simonw / llm-rag
Answer questions against collections stored in LLM using Retrieval Augmented Generation
☆30 · Jan 29, 2024 · Updated 2 years ago
cousintiz / Driver-s-Attention-Monitoring-System
open-source first release (OpenCV, Deepface, YOLOv8, Roboflow)
☆15 · Jan 2, 2025 · Updated last year
pashpashpash / mcp-atlassian
☆18 · Feb 18, 2025 · Updated last year
zhibaishouheilab / HealthiVert-GAN
HealthiVert-GAN, a novel deep-learning framework designed to generate pseudo-healthy vertebral images. These images simulate the pre-frac…
☆12 · Nov 3, 2025 · Updated 6 months ago
PetroGPT / PetroGPT
Large language model for the petroleum domain
☆18 · Feb 22, 2024 · Updated 2 years ago
AIAnytime / Code-Llama-GGUF-Demo
Code Llama GGUF Demo
☆10 · Aug 28, 2023 · Updated 2 years ago
jumon / himitsu
An official implementation of the paper "Addressing Segmentation Ambiguity in Neural Linguistic Steganography"
☆14 · Nov 12, 2022 · Updated 3 years ago
marliesvanderwees / dds-nmt
Dynamic data selection for neural machine translation
☆20 · Jan 28, 2018 · Updated 8 years ago
wangyePHD / SigStyle
☆15 · Mar 2, 2025 · Updated last year