GLM Series Edge Models
☆163Jun 12, 2025Updated 11 months ago
Alternatives and similar repositories for GLM-Edge
Users that are interested in GLM-Edge are comparing it to the libraries listed below.
- A Toolkit for Running On-device Large Language Models (LLMs) in APP☆83Jul 4, 2024Updated last year
- Our 2nd-gen LMM☆34May 22, 2024Updated 2 years ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆283May 8, 2026Updated last month
- ☆190Mar 13, 2026Updated 2 months ago
- Fast instruction tuning with Llama2☆11Apr 8, 2024Updated 2 years ago
- GLM-4-Voice | End-to-end Chinese-English spoken dialogue model☆3,184Dec 5, 2024Updated last year
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆273Aug 6, 2025Updated 10 months ago
- GPTQ inference TVM kernel☆40Apr 25, 2024Updated 2 years ago
- An open-sourced end-to-end VLM-based GUI Agent☆1,183Apr 4, 2025Updated last year
- [EMNLP 2024] Official repository for paper "From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis"☆22Oct 15, 2024Updated last year
- Reflect-RL: Two-Player Online RL Fine-Tuning for LMs☆18Jul 19, 2025Updated 10 months ago
- Strong and Open Vision Language Assistant for Mobile Devices☆1,355Apr 15, 2024Updated 2 years ago
- Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat…☆1,577Jun 14, 2025Updated 11 months ago
- GLM-4 series: Open Multilingual Multimodal Chat LMs☆7,066Jul 4, 2025Updated 11 months ago
- ✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction☆2,513Mar 28, 2025Updated last year
- ☆325Sep 18, 2024Updated last year
- Stable Diffusion in TensorRT 8.5+☆15Mar 19, 2023Updated 3 years ago
- A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from scratch with ease.☆256Apr 22, 2025Updated last year
- A benchmark for evaluating the logical reasoning of LLMs☆23Jan 25, 2024Updated 2 years ago
- An acceleration library that supports arbitrary bit-width combinatorial quantization operations☆245Sep 30, 2024Updated last year
- GPT4V-level open-source multi-modal model based on Llama3-8B☆2,439Mar 3, 2025Updated last year
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"☆13Jan 27, 2025Updated last year
- 【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models☆2,319Jul 15, 2025Updated 10 months ago
- Simple API for CosyVoice speech synthesis☆14Nov 1, 2024Updated last year
- A project for QNN quantization and CPU inference of YOLOv5 in the Qualcomm AI Engine Direct environment☆17Sep 10, 2024Updated last year
- ☆18Dec 7, 2023Updated 2 years ago
- MiniCPM5-1B: A SOTA 1B on-device LLM, small yet powerful.☆9,387May 31, 2026Updated last week
- Repository for Skill Set Optimization☆14Jul 26, 2024Updated last year
- A tool to convert a TensorRT engine/plan to a fake ONNX☆41Nov 22, 2022Updated 3 years ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models☆11Dec 13, 2023Updated 2 years ago
- Fast LLM training codebase with dynamic strategy selection [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler]☆41Jan 4, 2024Updated 2 years ago
- gradio bbox labeling tools☆11May 12, 2023Updated 3 years ago
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆75May 20, 2025Updated last year
- CogView4, CogView3-Plus and CogView3(ECCV 2024)☆1,103Mar 29, 2025Updated last year
- HunyuanDiT with TensorRT and libtorch☆18May 22, 2024Updated 2 years ago
- GGUF parser in Python☆29May 1, 2026Updated last month
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models☆68Sep 22, 2024Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Jul 12, 2023Updated 2 years ago
- ☆23Jan 29, 2026Updated 4 months ago