A side project that applies all the acceleration tricks from TinyLlama with minimal modifications to the Hugging Face transformers code.
☆13 · Sep 2, 2024 · Updated last year
Alternatives and similar repositories for tinyllama
Users that are interested in tinyllama are comparing it to the libraries listed below.
- ☆17 · Dec 19, 2024 · Updated last year
- ☆10 · Jun 6, 2023 · Updated 2 years ago
- [ACL'26] EvoToken-DLM (Beyond Hard Masks: Progressive Token Evolution for Diffusion Language) ☆48 · Apr 7, 2026 · Updated 3 weeks ago
- ☆16 · Oct 16, 2024 · Updated last year
- ☆22 · Dec 1, 2021 · Updated 4 years ago
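- 2024 CCF International AIOps Challenge, Track 2 (GLM4): solution write-up for the retrieval-augmented operations and maintenance knowledge Q&A challenge. ☆14 · Jul 5, 2024 · Updated last year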
- Finetune GPT2 for text summarization ☆17 · Aug 16, 2021 · Updated 4 years ago
- ☆19 · Dec 4, 2025 · Updated 5 months ago
- Implementation of our ACL 2020 paper: Structured Tuning for Semantic Role Labeling ☆18 · Apr 2, 2024 · Updated 2 years ago
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD ☆33 · Jul 1, 2025 · Updated 10 months ago
- Code for Pushdown Layers from our EMNLP 2023 paper ☆29 · Dec 3, 2023 · Updated 2 years ago
- RADLADS training code ☆40 · May 7, 2025 · Updated 11 months ago
- Code for Repl4NLP paper "A Cross-Task Analysis of Text Span Representations" ☆21 · Nov 4, 2022 · Updated 3 years ago
- Official repository for Flash Local Linear Attention ☆23 · Apr 23, 2026 · Updated last week
- Accepted by ACL 2025 ☆30 · Aug 13, 2025 · Updated 8 months ago
- A PyTorch wrapper of parallel exclusive scan in CUDA ☆12 · May 25, 2023 · Updated 2 years ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 · May 25, 2024 · Updated last year
- Code and data for "A Systematic Assessment of Syntactic Generalization in Neural Language Models" ☆29 · Jun 18, 2021 · Updated 4 years ago
- Cluster doctor skills ☆15 · Feb 20, 2026 · Updated 2 months ago
- ☆16 · Aug 19, 2024 · Updated last year
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling ☆40 · Dec 2, 2023 · Updated 2 years ago
- ☆52 · Jan 28, 2024 · Updated 2 years ago
- GPT2 implementation in C++ using Ort ☆26 · Jan 28, 2021 · Updated 5 years ago
- HGRN2: Gated Linear RNNs with State Expansion ☆57 · Aug 20, 2024 · Updated last year
- Dataset for Unified Editing, EMNLP 2023. This is a model editing dataset where edits are natural language phrases. ☆24 · Sep 4, 2024 · Updated last year
- Fork of NACA from Google Code ☆13 · Feb 25, 2010 · Updated 16 years ago
- Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity (ACL 2025, oral) ☆32 · Jun 14, 2025 · Updated 10 months ago
- Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs ☆29 · Dec 17, 2024 · Updated last year
- ☆20 · Apr 25, 2026 · Updated last week
- Tokenflood is a load testing framework for simulating arbitrary loads on instruction-tuned LLMs ☆45 · Apr 26, 2026 · Updated last week
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang ☆14 · Jan 4, 2024 · Updated 2 years ago
- Yad2 smart scraper with a minimal setup ☆20 · Jun 18, 2023 · Updated 2 years ago
- Parallel processing with sequential output, respecting order of input ☆10 · Feb 20, 2023 · Updated 3 years ago
- ☆11 · Aug 20, 2024 · Updated last year
- Code for the forget-only version of the LSTM in the paper "The unreasonable effectiveness of the forget gate" ☆29 · May 16, 2018 · Updated 7 years ago
- ☆15 · Apr 15, 2026 · Updated 3 weeks ago
- Commands that will make you more comfortable with the ROCm toolkit. ☆18 · Aug 1, 2024 · Updated last year
- Official Repo for Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics ☆73 · Mar 26, 2026 · Updated last month
- Predict the performance of LLM inference services ☆23 · Sep 18, 2025 · Updated 7 months ago