A side project that applies all the acceleration tricks from TinyLlama with minimal modifications to the Hugging Face Transformers code.
☆13Sep 2, 2024Updated last year
Alternatives and similar repositories for tinyllama
Users that are interested in tinyllama are comparing it to the libraries listed below.
- ☆17Dec 19, 2024Updated last year
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance…☆157Apr 7, 2025Updated last year
- [ACL'26] EvoToken-DLM (Beyond Hard Masks: Progressive Token Evolution for Diffusion Language)☆46Apr 7, 2026Updated last week
- ☆16Oct 16, 2024Updated last year
- ☆22Dec 1, 2021Updated 4 years ago
- Finetune GPT2 for text summarization☆17Aug 16, 2021Updated 4 years ago
- ☆19Dec 4, 2025Updated 4 months ago
- RADLADS training code☆39May 7, 2025Updated 11 months ago
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD☆33Jul 1, 2025Updated 9 months ago
- Code for Pushdown Layers from our EMNLP 2023 paper☆29Dec 3, 2023Updated 2 years ago
- [ICLR 2026] Official code for EdiVal-Agent: Automated, object-centric evaluation for multi-turn instruction-based image editing☆26Mar 1, 2026Updated last month
- EPoG: Integrated Exploration and Sequential Manipulation on Scene Graph with LLM-based Situated Replanning☆14Feb 6, 2026Updated 2 months ago
- Code for Repl4NLP paper "A Cross-Task Analysis of Text Span Representations"☆21Nov 4, 2022Updated 3 years ago
- [ECCV 2024] Official repository of ECCV 2024 paper: Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion M…☆15May 24, 2025Updated 10 months ago
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP2023 Findings.☆33Dec 6, 2023Updated 2 years ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi…☆23Oct 1, 2025Updated 6 months ago
- A PyTorch wrapper of parallel exclusive scan in CUDA☆12May 25, 2023Updated 2 years ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.)☆32May 25, 2024Updated last year
- Code and data for "A Systematic Assessment of Syntactic Generalization in Neural Language Models"☆29Jun 18, 2021Updated 4 years ago
- Download all Cloudflare durable object state to a local SQLite database.☆25Mar 9, 2024Updated 2 years ago
- ☆13Apr 5, 2022Updated 4 years ago
- Cluster doctor skills☆14Feb 20, 2026Updated last month
- [NeurIPS 2024] Official implementation of NeurIPS 2024 paper "Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory …☆26Feb 24, 2025Updated last year
- ☆15Aug 19, 2024Updated last year
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling☆40Dec 2, 2023Updated 2 years ago
- ☆51Jan 28, 2024Updated 2 years ago
- HGRN2: Gated Linear RNNs with State Expansion☆57Aug 20, 2024Updated last year
- Dataset for Unified Editing, EMNLP 2023. This is a model editing dataset where edits are natural language phrases.☆24Sep 4, 2024Updated last year
- VRKitchen: an Interactive 3D Environment for Learning Real Life Cooking Tasks. Visit the project site for more information: https://sites…☆25Oct 17, 2024Updated last year
- Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity (ACL 2025, oral)☆32Jun 14, 2025Updated 10 months ago
- Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs☆27Dec 17, 2024Updated last year
- A script to reorganize 'Want to go' Saved places in Google Maps into separate lists by category.☆11May 14, 2024Updated last year
- Continuous batching and parallel acceleration for RWKV6☆22Jun 28, 2024Updated last year
- A single-line modification to any (dualizer-based) optimizer that allows the optimizer to adapt to the scale of the gradients as they cha…☆19Jan 11, 2025Updated last year
- ☆20Updated this week
- Tokenflood is a load testing framework for simulating arbitrary loads on instruction-tuned LLMs☆45Mar 20, 2026Updated 3 weeks ago
- This is an implementation of DeepStack for No Limit Texas Hold'em, extended from DeepStack-Leduc.☆25Jun 16, 2019Updated 6 years ago
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang☆14Jan 4, 2024Updated 2 years ago
- Yad2 smart scraper with a minimal setup☆19Jun 18, 2023Updated 2 years ago