DezhiKong00/Sentencepiece-chinese-bbpe

DezhiKong00 / Sentencepiece-chinese-bbpe

Tokenizes Chinese corpora using SentencePiece

☆13

Alternatives and similar repositories for Sentencepiece-chinese-bbpe

Users that are interested in Sentencepiece-chinese-bbpe are comparing it to the libraries listed below.

Yikai-Liao / efficient_bpe
An Efficient BPE Algorithm Faster than Hugging Face Tokenizer's Implementation
☆13 · Sep 9, 2024 · Updated last year
HaujetZhao / SenseVoice-ONNX
SenseVoice-Small exported to ONNX, with hotword injection support; hotwords are replaced within 1 ms via path matching in the CTC output space
☆28 · Jun 3, 2026 · Updated last month
LIRUILONGS / mtcnn_demo
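Face detection service for producing face datasets suitable for face recognition. It detects faces with the MTCNN CNN, determines face pose with the open-source hopenet project to obtain head-pose Euler angles, measures face blur with the Laplacian operator, and selects suitable faces by applying thresholds to the three-stage MTCNN network and its confidence scores, the Euler angles, and the blur score
☆14 · May 17, 2024 · Updated 2 years ago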
atultiwari / LLaVA-Med
Large Language-and-Vision Assistant for BioMedicine, built towards multimodal GPT-4 level capabilities.
☆10 · Nov 29, 2023 · Updated 2 years ago
dalong0514 / dalong.ITstudy
computer study
☆27 · Jul 1, 2026 · Updated 3 weeks ago
MuyangDu / T5Voice
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28 · Nov 7, 2025 · Updated 8 months ago
CarlWangChina / REMAST-Real-time-Emotion-based-Music-Arrangement-with-Soft-Transition
SongDriver2 achieves a balance between real-time emotion fit and soft transitions, enhancing the coherence of the generated music.
☆11 · Nov 15, 2025 · Updated 8 months ago
youngjoey-ai / tracerag
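A RAG project that emphasizes engineering rigor, observability, testability, and extensibility. TraceRAG's goal is not merely to "generate" an answer, but to split document ingestion, chunking, embedding, retrieval, source-attributed answering, evaluation, and subsequent tracing into independently verifiable stages, gradually evolving into a maintainable, explainable, and reviewable production-grade RAG.
☆15 · Apr 2, 2026 · Updated 3 months ago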
yanivle / fast_minbpe
☆18 · Feb 6, 2025 · Updated last year
y-young / f1music
A campus music submission and voting system (a system for electing annual school music)
☆10 · Jul 7, 2026 · Updated 3 weeks ago
kyegomez / FastFF
Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling"
☆16 · Nov 11, 2024 · Updated last year
DavideGioiosa / cvae-chord-generation-complexity
Modeling Harmonic Complexity using two models of Conditional Variational Autoencoders - MSc. Thesis
☆10 · May 16, 2023 · Updated 3 years ago
Megum1 / UNIT
[ECCV'24] UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening
☆10 · Dec 18, 2025 · Updated 7 months ago
sxysxy / Taurix
Taurix OS kernel. A hands-on (and freewheeling) exercise in operating system principles
☆12 · Dec 20, 2020 · Updated 5 years ago
csiro-mlai / dl_hpc_starter_pack
pip install the deep learning & HPC starter pack to begin your project.
☆12 · Nov 6, 2024 · Updated last year
sjhan91 / Mixture2Music_Official
The implementation of "Instrument Separation of Symbolic Music by Explicitly Guided Diffusion Model"
☆15 · Aug 16, 2022 · Updated 3 years ago
zaocan666 / CollageNet
code and demo of the ISMIR 2021 paper CollageNet
☆12 · Jul 12, 2021 · Updated 5 years ago
DCRUNNN / QuantitiveTrading
Quantitative trading website; iteration 3 of the Software Engineering III major assignment, a team project
☆11 · Mar 8, 2018 · Updated 8 years ago
ZWD11 / Intelligent_warehousing_system
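An intelligent warehousing system (reusable) with a decoupled front end and back end, built on SpringBoot + Python + Vue. In addition to account and password login, it supports login by SMS verification code. Beyond conventional database interaction, it integrates a large language model so that users can interact with the database through conversation (typed or spoken) and get help analyzing the data in it; AI replies are in Markdown format
☆21 · Jul 10, 2024 · Updated 2 years ago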
Pchen0 / Web-Wechat-Bot
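A WeChat chatbot that can be logged into and managed remotely through a web page and can connect to large language models such as iFlytek Spark and ChatGPT; it uses the WeChat web protocol.
☆16 · Feb 20, 2024 · Updated 2 years ago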
cszhangyi / NewsApp
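NewsApp includes client source code, server source code, and database files. It is built on Recognition Emotion from Microsoft's AI project ProjectOxford and mainly pushes different categories of news based on the user's facial expression. For the Emotion API, see: https://www.p…
☆10 · Mar 2, 2016 · Updated 10 years ago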
RLuke22 / curriculum-learning-acr
ISMIR 2021: Curriculum Learning for Imbalanced Classification in Large Vocabulary Automatic Chord Recognition
☆10 · Nov 8, 2021 · Updated 4 years ago
EliasCai / shanghai-citywalk
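The 「城语」 app builds on base data such as A-level scenic spots, historic sites, and protected cultural heritage sites, and uses advanced large-model capabilities to plan Citywalk routes intelligently. Its three core functions are designing a route, generating a route guide, and generating recommendation reasons for attractions. Using large models reduces the workload of manual editing and recommendation and allows personalization to visitors' needs, improving visitors'…
☆19 · Feb 20, 2024 · Updated 2 years ago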
zyDotwei / tianchi_similar_sentence_pairs
Tianchi "Public Welfare AI Star" Challenge: COVID-19 similar sentence pair judgment competition
☆16 · Apr 12, 2020 · Updated 6 years ago
bwanglzu / Maximal-Marginal-Relevance
MMR for information retrieval
☆18 · Sep 22, 2017 · Updated 8 years ago
weiliang822 / ML-BigHW
Tongji University Computer Science machine learning major assignment
☆10 · Mar 22, 2025 · Updated last year
pluveto / bpe_v3
Chinese word segmentation based on BPE. Optimizations: preprocessing, parallel computation, multi-character words, multiple vocabularies
☆14 · May 14, 2022 · Updated 4 years ago
Guan-JW / GMM-Isolated-Speech-Recognition
Isolated-word speech recognition for the digits 0-9 using single-component GMMs built on MFCC features. MFCC, GMM, sklearn, isolated word recognition.
☆10 · Nov 18, 2020 · Updated 5 years ago
andrebola / contrastive-mir-learning
This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning"
☆15 · Jun 22, 2023 · Updated 3 years ago
0xCCF4 / ExpKit
A framework and build automation tool to process exploits/payloads to evade antivirus and endpoint detection response products using reus…
☆11 · Jan 16, 2024 · Updated 2 years ago
nobnak / VoxelRendering
☆17 · Jul 3, 2021 · Updated 5 years ago
mlop-ai / server
Serving Next Generation Experimental Tracking for Machine Learning Operations
☆15 · Mar 5, 2026 · Updated 4 months ago
emptymonkey / ghostshell
Simulates a logged in user.
☆16 · Jul 10, 2024 · Updated 2 years ago
yuyouyu32 / LLMQAEvaluate
☆17 · Dec 1, 2023 · Updated 2 years ago
TomoGaSukunai / so-vits-svc
Singing voice timbre conversion model based on VITS and SoftVC (community-maintained SVC repository)
☆14 · Mar 10, 2023 · Updated 3 years ago
moinnadeem / CDSSM
Convolutional Deep Semantic Similarity Model
☆20 · Feb 15, 2023 · Updated 3 years ago
sakurayun / bili-live-monitor
Monitors Bilibili live room data, saves it to a database in real time, and displays polished visual statistics charts on a built-in web page.
☆12 · Jan 4, 2022 · Updated 4 years ago
Charmve / EmotionCube
🐾 EmotionCube: an intelligent companion robot based on expression recognition and intelligent speech.
☆19 · May 27, 2024 · Updated 2 years ago
6zHAOyi / BadVision
This is an official code repository for CVPR 2025 paper BadVision.
☆15 · Nov 18, 2025 · Updated 8 months ago