Training code for Baby-Llama, our submission to the strict-small track of the BabyLM challenge.
☆85Oct 18, 2023Updated 2 years ago
Alternatives and similar repositories for BabyLlama
Users who are interested in BabyLlama are comparing it to the libraries listed below
- Code for pre-training BabyLM baseline models.☆16Jun 19, 2023Updated 2 years ago
- ☆16May 27, 2025Updated 9 months ago
- [ICCAD 2025] Squant☆15Jul 3, 2025Updated 8 months ago
- Experiments with BitNet inference on CPU☆55Apr 1, 2024Updated last year
- Cascade Speculative Drafting☆33Apr 2, 2024Updated last year
- [CVPR’25] PIVRG & ConsMTL☆21Oct 21, 2025Updated 4 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆39Jan 12, 2024Updated 2 years ago
- Official implementation of ECCV 2024 paper: Take A Step Back: Rethinking the Two Stages in Visual Reasoning☆14Jun 1, 2025Updated 9 months ago
- Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML2023)☆40Aug 28, 2023Updated 2 years ago
- [NeurIPS '25] Multi-Token Prediction Needs Registers☆27Dec 14, 2025Updated 2 months ago
- Implementation of "Decoding-time Realignment of Language Models", ICML 2024.☆21Jun 17, 2024Updated last year
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024☆22Jun 26, 2024Updated last year
- OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-1…☆25May 10, 2024Updated last year
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…☆31Nov 14, 2023Updated 2 years ago
- An innovative method expediting LLMs via streamlined semi-autoregressive generation and draft verification.☆26Apr 15, 2025Updated 10 months ago
- LTG-Bert☆34Jan 8, 2024Updated 2 years ago
- ☆15Sep 3, 2025Updated 6 months ago
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning☆642Mar 4, 2024Updated 2 years ago
- [Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin…☆41Sep 9, 2025Updated 5 months ago
- This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicit…☆1,260Mar 9, 2025Updated 11 months ago
- (ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation☆34May 28, 2025Updated 9 months ago
- Inference Llama 2 in one file of pure JavaScript(HTML)☆36May 20, 2025Updated 9 months ago
- ☆580Sep 7, 2023Updated 2 years ago
- A Framework for Evaluating AI Agent Safety in Realistic Environments☆30Oct 2, 2025Updated 5 months ago
- Code for "APTBench: Benchmarking Agentic Potential of Base LLMs During Pre-Training"☆38Dec 23, 2025Updated 2 months ago
- This is a repository of personal studies on "storytelling with data".☆16Jan 28, 2021Updated 5 years ago
- Code for ACL 2024 accepted paper titled "SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language …☆38Jan 13, 2025Updated last year
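- Y-Agent Studio is an agent development suite for enterprise-grade applications, with Y-Agent as its core module. It includes features that vertical domains badly need: agent orchestration, RAG, workflow logs, unit testing, workflow testing, and corpus production. Agent orchestration can support both multi-agent collaboration and mixed workflow orchestration within the same workflow…☆25Oct 4, 2025Updated 5 months ago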
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes"☆30Mar 28, 2024Updated last year
- Imperative deep learning framework with customized GPU and CPU backend☆30Jul 25, 2023Updated 2 years ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…☆39Mar 11, 2024Updated last year
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)☆251Mar 13, 2025Updated 11 months ago
- Multi-Candidate Speculative Decoding☆39Apr 22, 2024Updated last year
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry☆42Jan 15, 2024Updated 2 years ago
- Unity TTS plugin: Piper neural synthesis + OpenJTalk Japanese + Unity AI Inference Engine. Windows/Mac/Linux/Android/iOS ready. High-qual…☆18Updated this week
- Learn how to create impactful AI Agents using Agno AI Python Package☆13Jul 31, 2025Updated 7 months ago
- ☆13Jun 17, 2025Updated 8 months ago
- ☆11Feb 13, 2021Updated 5 years ago
- A Python script to delete all comment and submission data from a given Reddit account.☆11Jan 5, 2021Updated 5 years ago