A pure C++ cross-platform LLM acceleration library with Python bindings; supports Baichuan, GLM, LLaMA, and MOSS base models. Runs ChatGLM-6B-class models smoothly on mobile; a single GPU can reach 10,000+ tokens/s.
☆42 · Aug 16, 2023 · Updated 2 years ago
Alternatives and similar repositories for baichuan-speedup
Users that are interested in baichuan-speedup are comparing it to the libraries listed below.
- Implements Baichuan-Chat fine-tuning with LoRA, QLoRA, and other methods; one-click run. ☆70 · Aug 15, 2023 · Updated 2 years ago
- ggml implementation of the baichuan13b model (adapted from llama.cpp) ☆55 · Jul 27, 2023 · Updated 2 years ago
- PFCC community blog ☆14 · Apr 12, 2026 · Updated last week
- An open-source multimodal large language model based on baichuan-7b ☆72 · Dec 7, 2023 · Updated 2 years ago
- A demo of vllm's remarkable performance on Chinese large language models ☆31 · Nov 4, 2023 · Updated 2 years ago
- An STM32-based car implementing five basic functions: line following, infrared obstacle avoidance, ultrasonic obstacle avoidance, Bluetooth remote control, and infrared remote control ☆15 · May 25, 2024 · Updated last year
- This sample shows how to use the oneAPI Video Processing Library (oneVPL) to perform a single and multi-source video decode and preproces… ☆15 · Jun 15, 2023 · Updated 2 years ago
- Safety helmet detection built with darknet compiled on Windows 10, plus a simple MFC demo that runs object detection with the generated model. Safety Helmet Wearing Test ☆24 · Nov 7, 2019 · Updated 6 years ago
- Transformer-related optimization, including BERT, GPT ☆17 · Jul 29, 2023 · Updated 2 years ago
- ChatGLM2-6B-Explained ☆36 · Jul 28, 2023 · Updated 2 years ago
- Adds an RLHF implementation to ChatGLM-6B, with line-by-line explanations of the core code; the examples cover short news-headline generation and RLHF for context-conditioned recommendation ☆88 · Aug 16, 2023 · Updated 2 years ago
- Generates the large amounts of data needed by text error-correction models such as GECToR. ☆14 · Jan 5, 2023 · Updated 3 years ago
- Resources for Large Language Model Inference ☆17 · Dec 29, 2023 · Updated 2 years ago
- An Android sample using NativeActivity with an OpenGL ES and EGL engine ☆17 · Jul 8, 2017 · Updated 8 years ago
- CHATGPT-In-Jupyter ☆11 · Jun 2, 2023 · Updated 2 years ago
- PPDAI "Magic Mirror Cup" risk-control competition ☆12 · Dec 22, 2016 · Updated 9 years ago
- IntLLaMA: A fast and light quantization solution for LLaMA ☆18 · Jul 21, 2023 · Updated 2 years ago
- This repository open-sources our GEC system submitted by THU KELab (sz) in the CCL2023-CLTC Track 1: Multidimensional Chinese Learner Tex… ☆15 · Nov 25, 2023 · Updated 2 years ago
- CUDA-version image processing API ☆40 · Mar 17, 2019 · Updated 7 years ago
- PICA, a multi-turn empathetic dialogue model ☆98 · Sep 11, 2023 · Updated 2 years ago
- A lightweight deep learning framework for CSK60XX serial products ☆25 · Apr 3, 2026 · Updated 2 weeks ago
- High-level Rust library that binds to Poppler to extract text from a PDF ☆11 · Dec 16, 2020 · Updated 5 years ago
- "Financial Brain" financial-intelligence NLP service competition ☆17 · Apr 27, 2019 · Updated 6 years ago
- ☆15 · Jan 11, 2023 · Updated 3 years ago
- 1st-place solution for the Conversational Multi-Doc QA Workshop & International Challenge @ WSDM'24 - Xiaohongshu.Inc ☆159 · Jul 25, 2025 · Updated 8 months ago
- This project provides a face recognition system via OpenCV 4 ☆18 · Jan 16, 2019 · Updated 7 years ago
- Implements Dive into Deep Learning (《动手学深度学习》) with tensorflow.keras ☆10 · Jan 21, 2020 · Updated 6 years ago
- AIxCC: automated vulnerability repair via LLMs, search, and static analysis ☆12 · Jul 16, 2024 · Updated last year
- Examples of demo deployment using Gradio: image classification, live webcam segmentation, APIs, tunneling, etc. ☆17 · Oct 17, 2022 · Updated 3 years ago
- Create RP training data from a VN, using GPT-4 ☆18 · Nov 2, 2023 · Updated 2 years ago
- Modified from the official deepstream-test1 sample to pull an RTSP camera stream, run inference, and display the results ☆15 · Mar 11, 2020 · Updated 6 years ago
- C++ and CUDA extensions for Python/PyTorch and GPU-accelerated augmentation. ☆35 · Nov 30, 2022 · Updated 3 years ago
- Baidu QA dataset of 1,000,000 entries ☆45 · Nov 30, 2023 · Updated 2 years ago
- fastllm is a high-performance LLM inference library with no backend dependencies. It supports tensor-parallel inference for dense models and mixed-mode inference for MoE models; any GPU with 10 GB+ VRAM can run the full DeepSeek model. A dual-socket 9004/9005 server with a single GPU serves the full-precision original DeepSeek model at 20 tps single-stream; the INT4-quantized model reaches 30 tp… ☆4,189 · Apr 10, 2026 · Updated last week
- ☆11 · May 15, 2019 · Updated 6 years ago
- A small object-oriented learning project: a student information management system ☆10 · Oct 6, 2019 · Updated 6 years ago
- A 13B large language model developed by Baichuan Intelligent Technology ☆2,933 · Sep 6, 2023 · Updated 2 years ago
- An experimental project for paddle python IR. ☆15 · Dec 4, 2023 · Updated 2 years ago
- Accelerate generating vectors by using an ONNX model ☆18 · Jan 23, 2024 · Updated 2 years ago