A text generation method that returns a generator, streaming out each token in real time during inference, based on Huggingface/Transformers.
☆96Mar 11, 2024Updated 2 years ago
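The generator-based streaming described above can be sketched without any model at all. This is a minimal illustration of the pattern, not the library's actual API: `stream_generate` and `toy_next_token` are hypothetical names, and the toy "model" simply counts upward in place of a real forward pass and sampling step.

```python
from typing import Callable, Iterator, List


def stream_generate(
    next_token: Callable[[List[int]], int],
    prompt_ids: List[int],
    max_new_tokens: int,
    eos_id: int,
) -> Iterator[int]:
    """Yield each new token id as soon as it is chosen (hypothetical helper)."""
    ids = list(prompt_ids)  # copy so the caller's prompt is not mutated
    for _ in range(max_new_tokens):
        token = next_token(ids)  # one decoding step
        ids.append(token)
        yield token  # hand the token to the caller immediately
        if token == eos_id:
            break  # stop once the end-of-sequence token has been emitted


def toy_next_token(ids: List[int]) -> int:
    # Assumption: stands in for a model forward pass plus sampling.
    return ids[-1] + 1


# Tokens can be consumed one at a time, or collected as done here.
streamed = list(stream_generate(toy_next_token, [1, 2], max_new_tokens=10, eos_id=5))
```

Because the function is a generator, a caller can display or forward each token inside a `for` loop while later tokens are still being computed, instead of waiting for the whole sequence.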
Alternatives and similar repositories for transformers-stream-generator
Users that are interested in transformers-stream-generator are comparing it to the libraries listed below.
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet…☆18Aug 28, 2024Updated last year
- aigc evals☆10Dec 2, 2023Updated 2 years ago
- Sampling-Based Minimum Bayes-Risk Decoding for Neural Machine Translation☆16Oct 14, 2022Updated 3 years ago
- My favorite GNU/Linux flavor on the Microsoft Surface Duo.☆10Feb 7, 2024Updated 2 years ago
- multilabel categorical crossentropy☆15Apr 26, 2020Updated 5 years ago
- mSimCSE: Multilingual SimCSE☆33Nov 14, 2022Updated 3 years ago
- TTS Client for Coqui TTS server☆13Jan 7, 2023Updated 3 years ago
- Accelerate vector generation by using an ONNX model☆18Jan 23, 2024Updated 2 years ago
- A library for data streaming and augmentation☆21May 5, 2025Updated 11 months ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"☆40Nov 11, 2024Updated last year
- ⚡ boost inference speed of GPT models in transformers by onnxruntime☆52Aug 20, 2023Updated 2 years ago
- See details in https://github.com/pytorch/xla/blob/r1.12/torch_xla/distributed/fsdp/README.md☆25Dec 22, 2022Updated 3 years ago
- Code for AAAI 2024 paper Wikiformer☆20Dec 21, 2023Updated 2 years ago
- Code for Paper "Bidirectional Aggregation Network with Occlusion Handling for Panoptic Segmentation"☆16Mar 25, 2020Updated 6 years ago
- Transformer related optimization, including BERT, GPT☆6,412Mar 27, 2024Updated 2 years ago
- This repository is a TensorRT implementation of PINet☆18Nov 1, 2022Updated 3 years ago
- huggingface.co/chat api. Fixed stream response & web search.☆15Oct 1, 2023Updated 2 years ago
- ☆12Apr 24, 2024Updated last year
- ☆21Feb 15, 2024Updated 2 years ago
- Best practices for Harness: through a modular Agent Skills architecture, the system supports unlimited extension of tool capabilities☆36Apr 8, 2026Updated last week
- An innovative method expediting LLMs via streamlined semi-autoregressive generation and draft verification.☆28Apr 15, 2025Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆141Dec 6, 2024Updated last year
- ☆10Oct 15, 2020Updated 5 years ago
- Just a bunch of benchmark logs for different LLMs☆121Jul 28, 2024Updated last year
- Large language model fine-tuning: bloom, opt, gpt, gpt2, llama, llama-2, cpmant, and so on☆101Apr 24, 2024Updated last year
- Top 1 Million ranked websites and top-level domains (TLD)☆11Feb 4, 2026Updated 2 months ago
- Chinese CLIP models with SOTA performance.☆60Aug 28, 2023Updated 2 years ago
- Accelerating GOT-OCRv2 with VLLM☆10Nov 15, 2024Updated last year
- An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.☆5,047Apr 11, 2025Updated last year
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.☆2,107Jun 30, 2025Updated 9 months ago
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding☆1,327Mar 6, 2025Updated last year
- Reactive Multi-language Gradio App with minimal effort☆21Oct 12, 2025Updated 6 months ago
- Backup your Docker Volumes☆17May 16, 2023Updated 2 years ago
- Unofficial implementation for Sigmoid Loss for Language Image Pre-Training☆11Sep 26, 2023Updated 2 years ago
- [CVPR 2022 Challenge Rank 1st] The official code for V2L: Leveraging Vision and Vision-language Models into Large-scale Product Retrieval…☆29Jul 30, 2022Updated 3 years ago
- Train your own object detection model with ATSS! Extremely detailed tutorial and PDF tutorial download!☆10Jul 28, 2020Updated 5 years ago
- Track your mood during the job.☆12Jun 2, 2021Updated 4 years ago
- Unofficial version of https://sourceforge.net/projects/npp-plugins/files/SpeechPlugin/☆12Mar 23, 2026Updated 3 weeks ago
- Manages vllm-nccl dependency☆17Jun 3, 2024Updated last year