BlinkDL / LM-Trick-Questions
Here we collect trick questions and failed tasks for open source LLMs to improve them.
☆32 Updated 2 years ago
Alternatives and similar repositories for LM-Trick-Questions:
Users who are interested in LM-Trick-Questions are comparing it to the libraries listed below.
- 32 times longer context window than vanilla Transformers and up to 4 times longer than memory efficient Transformers. ☆48 Updated last year
- Here we will test various linear attention designs. ☆60 Updated last year
- Let us make Psychohistory (as in Asimov) a reality, and accessible to everyone. Useful for LLM grounding and games / fiction / business /… ☆40 Updated 2 years ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆37 Updated last year
- Structural Pruning for LLaMA ☆54 Updated last year
- Explorations into adversarial losses on top of autoregressive loss for language modeling ☆35 Updated 2 months ago
- Demonstration that fine-tuning a RoPE model on longer sequences than it was pre-trained on extends the model's context limit ☆63 Updated last year
- GoldFinch and other hybrid transformer components ☆45 Updated 9 months ago
- RWKV model implementation ☆37 Updated last year
- sigma-MoE layer ☆18 Updated last year
- Griffin MQA + Hawk Linear RNN Hybrid ☆86 Updated last year
- Tools for content datamining and NLP at scale ☆43 Updated 10 months ago
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆41 Updated 2 years ago
- Implementation of the Mamba SSM with hf_integration. ☆56 Updated 8 months ago
- Un-*** 50 billions multimodality dataset ☆24 Updated 2 years ago
- ☆32 Updated last year
- Utilities for Training Very Large Models ☆58 Updated 7 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆27 Updated this week
- A simple reproducible template to implement AI research papers ☆24 Updated 7 months ago
- ☆20 Updated 11 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆27 Updated 7 months ago
- A reproduction of the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction" ☆22 Updated 11 months ago
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto ☆55 Updated 11 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆97 Updated 7 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- HGRN2: Gated Linear RNNs with State Expansion ☆54 Updated 8 months ago
- ☆46 Updated 9 months ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; supports both image and video… ☆33 Updated 10 months ago
- My explorations into editing the knowledge and memories of an attention network ☆34 Updated 2 years ago
- Large-scale inference for RWKV v6 and v7 (World, ARWKV, PRWKV). Capable of inference by combining multiple states (Pseudo MoE). Easy to deploy o… ☆35 Updated this week