A side project that applies all the acceleration tricks from tinyllama, with minimal modifications to the Hugging Face transformers code.
☆13Sep 2, 2024Updated last year
Alternatives and similar repositories for tinyllama
Users who are interested in tinyllama are comparing it to the libraries listed below.
- ☆17Dec 19, 2024Updated last year
- [ACL'26] EvoToken-DLM (Beyond Hard Masks: Progressive Token Evolution for Diffusion Language)☆48Apr 7, 2026Updated last month
- ☆22Dec 1, 2021Updated 4 years ago
- Finetune GPT2 for text summarization☆17Aug 16, 2021Updated 4 years ago
- Implementation of our ACL 2020 paper: Structured Tuning for Semantic Role Labeling☆18Apr 2, 2024Updated 2 years ago
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD☆33Jul 1, 2025Updated 10 months ago
- Code for Pushdown Layers from our EMNLP 2023 paper☆29Dec 3, 2023Updated 2 years ago
- RADLADS training code☆43May 7, 2025Updated last year
- Code for Repl4NLP paper "A Cross-Task Analysis of Text Span Representations"☆21Nov 4, 2022Updated 3 years ago
- Official repository for Flash Local Linear Attention☆23Apr 23, 2026Updated last month
- A PyTorch wrapper of parallel exclusive scan in CUDA☆12May 25, 2023Updated 3 years ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.)☆32May 25, 2024Updated 2 years ago
- Download all Cloudflare durable object state to a local SQLite database.☆24Mar 9, 2024Updated 2 years ago
- Cluster doctor skills☆15Feb 20, 2026Updated 3 months ago
- [NeurIPS 2024] Official implementation of NeurIPS 2024 paper "Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory …☆26Feb 24, 2025Updated last year
- ☆16Aug 19, 2024Updated last year
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling☆40Dec 2, 2023Updated 2 years ago
- [ICLR 2022] Code for paper "Exploring Extreme Parameter Compression for Pre-trained Language Models"(https://arxiv.org/abs/2205.10036)☆22May 24, 2023Updated 3 years ago
- HGRN2: Gated Linear RNNs with State Expansion☆57Aug 20, 2024Updated last year
- Here we will test various linear attention designs.☆62Apr 25, 2024Updated 2 years ago
- Dataset for Unified Editing, EMNLP 2023. This is a model editing dataset where edits are natural language phrases.☆24Sep 4, 2024Updated last year
- Fork of NACA from Google Code☆13Feb 25, 2010Updated 16 years ago
- Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity (ACL 2025, oral)☆34Jun 14, 2025Updated 11 months ago
- Continuous batching and parallel acceleration for RWKV6☆22Jun 28, 2024Updated last year
- ☆20Updated this week
- Tokenflood is a load testing framework for simulating arbitrary loads on instruction-tuned LLMs☆45May 18, 2026Updated last week
- Parallel processing with sequential output, respecting order of input☆10Feb 20, 2023Updated 3 years ago
- Test Orchestrator for Performance and Scalability of AI pLatforms☆18May 11, 2026Updated 2 weeks ago
- A Google Contacts server using MCP☆30Oct 22, 2025Updated 7 months ago
- ☆11Aug 20, 2024Updated last year
- Code for the forget-only version of the LSTM in the paper "The unreasonable effectiveness of the forget gate"☆29May 16, 2018Updated 8 years ago
- ☆48Jun 16, 2025Updated 11 months ago
- Exercise to implement DDD with CQRS using Dapr for Pub-Sub.☆13Jul 28, 2021Updated 4 years ago
- Commands that will make you more comfortable with the ROCm toolkit.☆18Aug 1, 2024Updated last year
- Code for Max-Margin Contrastive Learning - AAAI 2022☆17Apr 25, 2022Updated 4 years ago
- Official Repo for Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics☆76Mar 26, 2026Updated 2 months ago
- HaSTL: A fast GPU implementation of STL decomposition with missing values and support for both CUDA and OpenCL☆13Sep 11, 2023Updated 2 years ago
- Vocabulary Parallelism☆26Mar 10, 2025Updated last year
- Executable form of the MiFID II RTS (Regulatory Technical Standard) documents.☆18Jun 17, 2018Updated 7 years ago