run ChatGLM2-6B in BM1684X
☆49Mar 1, 2024Updated 2 years ago
Alternatives and similar repositories for ChatGLM2-TPU
Users that are interested in ChatGLM2-TPU are comparing it to the libraries listed below.
- A whisper repo for TPU☆11Jun 4, 2024Updated 2 years ago
- ☆44Jul 5, 2024Updated last year
- run chatglm3-6b in BM1684X☆39Mar 1, 2024Updated 2 years ago
- Run generative AI models in sophgo BM1684X/BM1688☆286Updated this week
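- DETR tensor: removes auxiliary heads that are unused during inference, adds FP16 deployment for a further speedup, and offers a new method for fixing the all-zero output problem when converting to TensorRT.☆11Jan 9, 2024Updated 2 years ago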
- Guide to deploying deep-learning inference networks and deep vision primitives on Sophon TPU.☆35May 25, 2023Updated 3 years ago
- ☆12Dec 16, 2021Updated 4 years ago
- Machine learning compiler based on MLIR for Sophgo TPU.☆933Updated this week
- simplify >2GB large onnx model☆71Nov 30, 2024Updated last year
- Another ChatGLM2 implementation for GPTQ quantization☆55Oct 15, 2023Updated 2 years ago
- For 2022 Nvidia Hackathon☆22Jun 28, 2022Updated 3 years ago
- PyTorch in Go, using LibTorch.☆15May 21, 2019Updated 7 years ago
- ☆25Aug 14, 2025Updated 10 months ago
- ☆30Jun 2, 2022Updated 4 years ago
- Deploys visibility enhancement for nighttime hazy images with onnxruntime; includes both C++ and Python versions of the program☆13Feb 17, 2024Updated 2 years ago
- NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that del…☆26Jul 21, 2023Updated 2 years ago
- ☆53Mar 27, 2023Updated 3 years ago
- OpenVINO™ optimization for PointPillars*☆32May 5, 2025Updated last year
- Accelerated deployment of BERT models with TensorRT☆10Apr 1, 2022Updated 4 years ago
- Python scripts performing Open Vocabulary Object Detection using the YOLO-World model in ONNX. And Export the ONNX model for AXera's NPU☆12Aug 11, 2025Updated 10 months ago
- Examples for SophonSDK☆107Aug 11, 2022Updated 3 years ago
- TensorRT-FastSAM(https://github.com/CASIA-IVA-Lab/FastSAM)☆23Feb 29, 2024Updated 2 years ago
- Software and hardware decoding of H.264, based on FFmpeg and MPP☆11Mar 23, 2022Updated 4 years ago
- ffmpeg+cuvid+tensorrt+multicamera☆12Dec 31, 2024Updated last year
- SAM and lama inpaint, with a Qt GUI: interactively draw points or boxes to run SAM with results displayed in real time, then run Inpaint; see the video in the README for the exact steps.☆54Jan 30, 2024Updated 2 years ago
- PointPillars TensorRT version pretrained on MMDetection3d with WaymoOpenDataset☆23Aug 11, 2022Updated 3 years ago
- YoloV8 segmentation NPU for the RK 3566/68/88☆18Apr 30, 2024Updated 2 years ago
- ☆26Feb 2, 2024Updated 2 years ago
- JAX bindings for the flash-attention3 kernels☆23Jan 2, 2026Updated 5 months ago
- 🎉My Collections of CUDA Kernels~☆11Jun 25, 2024Updated last year
- ChatTTS is a generative speech model for daily dialogue.☆14Oct 21, 2024Updated last year
- A CUDA runtime environment based on the CUDA Driver API☆16Jul 30, 2025Updated 10 months ago
- ☆36Mar 29, 2023Updated 3 years ago
- export llama to onnx☆138Dec 28, 2024Updated last year
- Multiple Lidar preprocessor for BEVfusion☆11Aug 25, 2023Updated 2 years ago
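- Reproduces the point cloud segmentation task based on Point Transformers and applies the HAQ algorithm for automatic quantization and compression, with almost no loss of accuracy☆26Aug 25, 2022Updated 3 years ago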
- Model Quantization Benchmark☆19Apr 17, 2026Updated last month
- pose estimation code with deepstream and yolo-pose☆13Oct 14, 2022Updated 3 years ago
- ☆46Updated this week