vLLM fork for Tesla V100 (SM70) with AWQ 4-bit support, CUDA 12.8 build flow, and validated Qwen3.5 27B/35B deployment on multi-GPU V100.
☆312May 20, 2026Updated last week
Alternatives and similar repositories for 1Cat-vLLM
Users that are interested in 1Cat-vLLM are comparing it to the libraries listed below.
- ECAI 2025☆20May 4, 2026Updated 3 weeks ago
- Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"☆11Mar 31, 2024Updated 2 years ago
- Code for Findings of ACL 2023 paper "Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency …☆10Jul 18, 2023Updated 2 years ago
- Building a quick conversation-based search demo with langchain.☆10Apr 2, 2024Updated 2 years ago
- Modification of a dataset for bird body and face detection☆15Jul 13, 2018Updated 7 years ago
- machine translation data process tools☆10Apr 29, 2024Updated 2 years ago
- ☆17Mar 6, 2019Updated 7 years ago
- ☆11Nov 12, 2018Updated 7 years ago
- A wrapper to use AqBanking CLI from a PHP context☆13Dec 5, 2018Updated 7 years ago
- Easy-to-use and fast NLP library with an awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications.☆12Mar 13, 2024Updated 2 years ago
- STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models☆48Apr 23, 2026Updated last month
- ☆11Jul 8, 2023Updated 2 years ago
- fast-embeddings-api☆16Nov 23, 2023Updated 2 years ago
- A GitHub App to chat with your GitHub repo's issues using ChatGPT☆16Mar 8, 2023Updated 3 years ago
- PaddleSeq☆10Mar 28, 2023Updated 3 years ago
- A tiny and efficient non-blocking or asynchronous network library☆13May 4, 2019Updated 7 years ago
- ☆12May 20, 2022Updated 4 years ago
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"☆13Jan 27, 2025Updated last year
- One Line To Build Zero-Data Classifiers in Minutes☆65Sep 25, 2024Updated last year
- Converts Mandarin Chinese pinyin notation to IPA (international phonetic alphabet) notation☆19Nov 28, 2023Updated 2 years ago
- Large language model toolkit☆28Aug 1, 2025Updated 9 months ago
- Code for NAACL 2022 main conference paper "Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation"☆12May 8, 2023Updated 3 years ago
- Eagle Files for piMeter EnergyMonitor☆13Sep 3, 2018Updated 7 years ago
- Make any person bald!! Component of the paper: Learning to regulate 3D head shape by removing occluding hair from in-the-wild images.☆12Jun 6, 2022Updated 3 years ago
- CogNetX is an advanced, multimodal neural network architecture inspired by human cognition. It integrates speech, vision, and video proce…☆21May 12, 2026Updated 2 weeks ago
- ☆11Feb 26, 2024Updated 2 years ago
- ChatBot is an AI chatbot service based on Spring Boot, providing functions such as AI dialogue, user profile management, and the Big Five…☆30Apr 9, 2026Updated last month
- A quick and optimized solution to manage llama based gguf quantized models, download gguf files, retrieve message formatting, add more mo…☆13Jan 13, 2024Updated 2 years ago
- Fine grained visual recognition tensorflow baseline on CUB, Stanford Cars, Dogs, Aircrafts, and Flower102.☆24Oct 9, 2021Updated 4 years ago
- Attempt at cog wrapper for IP_Adapter-face for SDXL☆15Nov 25, 2024Updated last year
- Answer questions against collections stored in LLM using Retrieval Augmented Generation☆30Jan 29, 2024Updated 2 years ago
- open-source first release (OpenCV, Deepface, YOLOv8, Roboflow)☆15Jan 2, 2025Updated last year
- ☆18Feb 18, 2025Updated last year
- HealthiVert-GAN, a novel deep-learning framework designed to generate pseudo-healthy vertebral images. These images simulate the pre-frac…☆12Nov 3, 2025Updated 6 months ago
- Large language model for the petroleum domain☆18Feb 22, 2024Updated 2 years ago
- Code Llama GGUF Demo☆10Aug 28, 2023Updated 2 years ago
- An official implementation of the paper "Addressing Segmentation Ambiguity in Neural Linguistic Steganography"☆14Nov 12, 2022Updated 3 years ago
- Dynamic data selection for neural machine translation☆20Jan 28, 2018Updated 8 years ago
- ☆15Mar 2, 2025Updated last year