BlinkDL / LM-Trick-Questions
A collection of trick questions and failed tasks for open-source LLMs, gathered to help improve them.
☆32 · Updated last year
Related projects:
- Here we will test various linear attention designs. ☆55 · Updated 4 months ago
- Let us make Psychohistory (as in Asimov) a reality, and accessible to everyone. Useful for LLM grounding and games / fiction / business /… ☆40 · Updated last year
- RWKV model implementation. ☆38 · Updated last year
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers". ☆34 · Updated 10 months ago
- GoldFinch and other hybrid transformer components. ☆38 · Updated 2 months ago
- Structural Pruning for LLaMA. ☆55 · Updated last year
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention". ☆87 · Updated 8 months ago
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto. ☆53 · Updated 4 months ago
- Implementation of the paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆48 · Updated last week
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts. ☆101 · Updated last year
- 32 times longer context window than vanilla Transformers, and up to 4 times longer than memory-efficient Transformers. ☆37 · Updated last year
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ☆15 · Updated this week
- PyTorch implementation of models from the Zamba2 series. ☆63 · Updated last month
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning. ☆33 · Updated last year
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling. ☆34 · Updated 9 months ago
- Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling". ☆15 · Updated last week
- Mixture of A Million Experts. ☆29 · Updated last month
- A simple, reproducible template for implementing AI research papers. ☆21 · Updated last week
- My explorations into editing the knowledge and memories of an attention network. ☆34 · Updated last year
- Implementation of the Mamba SSM with hf_integration. ☆55 · Updated 2 weeks ago
- Tools for content datamining and NLP at scale. ☆41 · Updated 3 months ago
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024). ☆22 · Updated 3 months ago
- A reproduction of the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction". ☆21 · Updated 3 months ago
- Demonstration that fine-tuning a RoPE model on sequences longer than those seen in pre-training extends the model's context limit. ☆62 · Updated last year