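A pure C++ cross-platform LLM acceleration library with Python bindings. Supports the baichuan, glm, llama, and moss base models; runs ChatGLM-6B-class models smoothly on mobile devices, and a single GPU can reach 10000+ tokens/s.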
☆42Aug 16, 2023Updated 2 years ago
Alternatives and similar repositories for baichuan-speedup
Users that are interested in baichuan-speedup are comparing it to the libraries listed below.
- ggml implementation of the baichuan13b model (adapted from llama.cpp)☆55Jul 27, 2023Updated 2 years ago
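- PFCC community blog☆14Updated this week
- Tianchi competition: Financial Brain - Financial Intelligence NLP Service☆16Jul 9, 2018Updated 7 years ago
- A line-by-line annotated version of the Baichuan2 code, suitable for beginners☆211Sep 20, 2023Updated 2 years ago
- An open-source multimodal large language model based on baichuan-7b☆72Dec 7, 2023Updated 2 years ago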
- ☆23Jul 17, 2023Updated 2 years ago
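- Demonstrates the remarkable effect of vllm on Chinese large language models☆31Nov 4, 2023Updated 2 years ago
- An STM32-based car with five basic functions: line tracking, infrared obstacle avoidance, ultrasonic obstacle avoidance, Bluetooth remote control, and infrared remote control☆18May 25, 2024Updated 2 years ago
- An evaluation benchmark for Chinese financial LLMs: 25 tasks in six categories with graded ratings; domestic (Chinese) models achieve grade A☆11May 6, 2024Updated 2 years ago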
- Transformer related optimization, including BERT, GPT☆17Jul 29, 2023Updated 2 years ago
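- PaddlePaddle model encryption library☆10Nov 13, 2021Updated 4 years ago
- AI-based social insurance anti-fraud analysis☆29Aug 5, 2018Updated 7 years ago
- A desktop pet program that now seems to have evolved into a desktop sticky-notes app. For the sticky-notes program, see the develop-todolist branch.☆11Nov 17, 2024Updated last year
- Explains how to use pycrfsuite through examples☆10Nov 7, 2016Updated 9 years ago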
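- ChatGLM-6B with an added RLHF implementation and line-by-line explanations of some of the core code; the examples cover generating short news headlines and an RLHF implementation for recommendation given a specified context☆88Aug 16, 2023Updated 2 years ago
- An Android sample using a native activity, OpenGL ES, and an EGL engine☆17Jul 8, 2017Updated 8 years ago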
- Android runtime permissions manager☆13Jul 8, 2019Updated 6 years ago
- CHATGPT-In-Jupyter☆11Jun 2, 2023Updated 3 years ago
- Comparison of existing spell checking tools☆11Mar 28, 2023Updated 3 years ago
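- Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, along with an efficient, lightweight framework for training vertical-domain LLMs (Pretraining, SFT, RLHF, Quantize, etc.)☆657Jun 30, 2023Updated 2 years ago
- PPDai "Magic Mirror Cup" risk control competition☆12Dec 22, 2016Updated 9 years ago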
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Jul 21, 2023Updated 2 years ago
- BiLLa: A Bilingual LLaMA with Enhanced Reasoning Ability☆417Jun 1, 2023Updated 3 years ago
- Deploys the lightweight, real-time M-LSD line detector with ONNXRuntime; includes both C++ and Python versions☆28Jan 21, 2023Updated 3 years ago
- Code for M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models☆23Jul 27, 2024Updated last year
- PICA, a multi-turn empathetic dialogue model☆97Sep 11, 2023Updated 2 years ago
- Playground project acting as an example for a complex LangChain workflow☆11Jun 20, 2023Updated 2 years ago
- 1st Solution For Conversational Multi-Doc QA Workshop & International Challenge @ WSDM'24 - Xiaohongshu.Inc☆157Jul 25, 2025Updated 10 months ago
- This project provides a face recognition system using OpenCV 4☆18Jan 16, 2019Updated 7 years ago
- A code repository for learning OpenGL☆15Jun 5, 2026Updated 2 weeks ago
- Examples of demo deployment using Gradio: image classification, live webcam segmentation, APIs, tunneling, etc.☆17Oct 17, 2022Updated 3 years ago
- Deep learning model SSD_300x300 ported to TensorRT (Nvidia Jetson TX2)☆11Dec 8, 2018Updated 7 years ago
- Create RP training data from a VN, using GPT-4☆19Nov 2, 2023Updated 2 years ago
- Modified from the official deepstream-test1 source code; reads from an RTSP camera, runs inference, and displays the results☆15Mar 11, 2020Updated 6 years ago
- Apply prompt learning in Chinese NER tasks☆13Mar 24, 2022Updated 4 years ago
- A simple forward-inference framework obtained by taking apart MNN (for study!)☆24Feb 21, 2021Updated 5 years ago
- CUDA SGEMM optimization note☆15Oct 31, 2023Updated 2 years ago
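- fastllm is a high-performance LLM inference library with no backend dependencies. It supports both tensor-parallel inference for dense models and mixed-mode inference for MoE models; any GPU with 10 GB or more of VRAM can run the full DeepSeek model. A dual-socket 9004/9005 server with a single GPU deploys the original full-precision DeepSeek model at 20 tps for a single concurrent request; the INT4-quantized model reaches 30 tp…☆4,770Updated this week
- Face recognition using the open-source face_recognition library, with a web page built in Django and WebSocket used for data transfer.☆11Sep 8, 2019Updated 6 years ago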