Yikai-Liao/efficient_bpe

An Efficient BPE Algorithm Faster than Hugging Face Tokenizer's Implementation

☆13

Alternatives and similar repositories for efficient_bpe

Users that are interested in efficient_bpe are comparing it to the libraries listed below.

DezhiKong00 / Sentencepiece-chinese-bbpe
Tokenizes Chinese corpora using Sentencepiece
☆13 · Nov 30, 2023 · Updated 2 years ago
ChengYuqing1998 / Modern-Transformer-NMT-zh2en
An educational Chinese-to-English NMT project featuring the classic encoder-decoder Transformer and a configurable modern decoder-only GP…
☆20 · Jun 7, 2026 · Updated last month
youngjoey-ai / tracerag
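A RAG project that emphasizes engineering rigor, observability, testability, and extensibility. TraceRAG's goal is not merely to "generate" an answer, but to split document ingestion, chunking, embedding, retrieval, source-attributed answering, evaluation, and later tracing into independently verifiable stages, gradually evolving into a maintainable, explainable, and reviewable production-grade RAG.
☆15 · Apr 2, 2026 · Updated 3 months ago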
yanivle / fast_minbpe
☆18 · Feb 6, 2025 · Updated last year
y-young / f1music
Campus music submission and voting system. A system for electing annual school music
☆10 · Jul 7, 2026 · Updated 2 weeks ago
E5Anant / UnisonAI
The UnisonAI Multi-Agent Framework, built on a custom workflow that allows AI agents to talk together and provides a flexible and extensibl…
☆23 · Feb 24, 2026 · Updated 5 months ago
Megum1 / UNIT
[ECCV'24] UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening
☆10 · Dec 18, 2025 · Updated 7 months ago
sxysxy / Taurix
Taurix OS kernel, written as hands-on (and admittedly haphazard) practice of operating system principles
☆12 · Dec 20, 2020 · Updated 5 years ago
DCRUNNN / QuantitiveTrading
Quantitative trading website; iteration 3 of the Software Engineering III course project; a team project
☆11 · Mar 8, 2018 · Updated 8 years ago
RAZZULLIX / fast_topk_batched
High-performance batched Top-K selection for CPU inference. Up to 80x faster than PyTorch, optimized for LLM sampling with AVX2 SIMD.
☆18 · Mar 20, 2026 · Updated 4 months ago
Pchen0 / Web-Wechat-Bot
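A WeChat chatbot that can be logged into and managed remotely through a web page and can connect to large language models such as iFlytek Spark and ChatGPT. It uses the WeChat web protocol.
☆16 · Feb 20, 2024 · Updated 2 years ago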
cszhangyi / NewsApp
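NewsApp includes client source code, server source code, and database files. It is built on Recognition Emotion from Microsoft's AI project ProjectOxford and mainly pushes different categories of news based on the user's facial expression. For the Emotion API, see: https://www.p…
☆10 · Mar 2, 2016 · Updated 10 years ago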
EliasCai / shanghai-citywalk
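The 「城语」 app builds on base data about A-level scenic areas, historic sites, protected cultural relics, and similar places, and uses advanced large-model capabilities to plan Citywalk routes intelligently. It has three core functions: designing a route, generating a route guide, and generating recommendation reasons for attractions. Using large models reduces manual editing and recommendation work and allows personalization to visitors' needs, improving visitors'…
☆19 · Feb 20, 2024 · Updated 2 years ago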
weiliang822 / ML-BigHW
Machine learning course project, Computer Science, Tongji University
☆10 · Mar 22, 2025 · Updated last year
pluveto / bpe_v3
Chinese word segmentation based on BPE. Optimizations: preprocessing, parallel computation, multi-character words, multiple vocabularies
☆14 · May 14, 2022 · Updated 4 years ago
Guan-JW / GMM-Isolated-Speech-Recognition
Isolated-word speech recognition for the digits 0-9 using single-kernel GMMs built on MFCC features. MFCC, GMM, sklearn, isolated word recognition.
☆10 · Nov 18, 2020 · Updated 5 years ago
gradio-app / sambanova-gradio
☆23 · Nov 4, 2024 · Updated last year
SreejanPersonal / Gemini-Live-2.0
☆15 · Jan 6, 2025 · Updated last year
sakurayun / bili-live-monitor
Monitors Bilibili live-room data, saves it to a database in real time, and displays polished visual statistics charts on a built-in web page.
☆12 · Jan 4, 2022 · Updated 4 years ago
YuvrajSingh-mist / SmolLlama
So, I trained a 130M Llama architecture that I coded from the ground up to build a small instruct model from scratch. Trained on FineWeb dataset…
☆18 · Mar 26, 2025 · Updated last year
liu-qingyuan / faster_whisper_gradio
Real time faster whisper gradio
☆24 · Aug 17, 2025 · Updated 11 months ago
Charmve / EmotionCube
🐾 EmotionCube: An intelligent companion robot designed around expression recognition and intelligent speech.
☆19 · May 27, 2024 · Updated 2 years ago
6zHAOyi / BadVision
This is the official code repository for the CVPR 2025 paper BadVision.
☆15 · Nov 18, 2025 · Updated 8 months ago
ninehills / embedding_finetuning
Fine-tuning embedding models.
☆14 · Nov 25, 2024 · Updated last year
tzafon / lightcone
Lightcone: SDK for computer use agents
☆16 · Jul 16, 2026 · Updated last week
lkwq007 / flux-flax
JAX port of FLUX.1 models using flax.nnx
☆23 · Sep 28, 2024 · Updated last year
mikenote / homekeeping
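This is a college-student Internet+ innovation project: "一点到家", the 云滇 (Yundian) housekeeping platform supporting rural revitalization. Front end: WeChat mini program; back end: springboot; database: mysql. A highly recommended project: the source code is simple and readable, clean and concise, thoroughly commented, and suitable for further development. Full of creativity and close to everyday life, it eases employment pressure, raises farmers' incomes, and promotes…
☆14 · Jun 17, 2023 · Updated 3 years ago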
Carla-de-Beer / p5js-projects
P5js sketches (Processing for JavaScript)
☆19 · Jan 21, 2026 · Updated 6 months ago
dutzxf1993 / stock-data-analysis
Scrapes stock market data with Python and restructures and analyzes stock quotes and financial data across industries
☆10 · Jul 26, 2020 · Updated 6 years ago
reiniscimurs / Bezier-Curve
Python class for creating and optimizing quadratic and cubic Bezier curves and path smoothing implementation.
☆47 · Jun 19, 2021 · Updated 5 years ago
BochaAI / open-webui-Bocha
By leveraging the Bocha AI Search API, your AI applications can now access high-quality, up-to-date knowledge from billions of web pages and…
☆21 · Feb 9, 2025 · Updated last year
Philosober / AI-fundamentals-2025-Spring
Introduction to Artificial Intelligence (honors class), second semester of the 2024-2025 academic year
☆17 · Jun 16, 2025 · Updated last year
SuchScar / FootPrint
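University project archive, no. 1: a travel check-in project. Checking in means visiting interesting, pre-recorded travel stops and leaving your footprint there; we call these stops points of interest. In this system you can also plan your own itinerary, deciding in advance which points of interest to visit and which route to take, and then travel according to the system's route plan. During the trip you can write some text and post pictures; the whole trip…
☆10 · Apr 27, 2018 · Updated 8 years ago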
Likhithsai2580 / JARVIS-RE-J4E
Jarvis made by Kaushik Shresth, reverse engineered by Likhi
☆16 · Feb 16, 2025 · Updated last year
seanzhang-zhichen / Qwen-WisdomVast
Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …
☆17 · Apr 12, 2024 · Updated 2 years ago
JonathanYang0127 / omnimimic
Cross-Embodiment Robot Learning Codebase
☆52 · Apr 20, 2024 · Updated 2 years ago
NiuTrans / ODEs-in-Vision-and-Language
An introduction to ODEs and their applications in vision and language
☆15 · Feb 26, 2026 · Updated 5 months ago
OpenEarthLab / FengWu-4DVar
Integrating Large Weather Models with Data Assimilation
☆25 · Jun 2, 2024 · Updated 2 years ago
yoongi43 / VRVQ
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11 · Apr 10, 2025 · Updated last year