BlinkDL / LM-Trick-Questions

Here we collect trick questions and failed tasks for open source LLMs to improve them.

☆32

Alternatives and similar repositories for LM-Trick-Questions

Users who are interested in LM-Trick-Questions are comparing it to the repositories listed below.

kyegomez / Blockwise-Parallel-Transformer
32 times longer context window than vanilla Transformers and up to 4 times longer than memory efficient Transformers.
☆48 Updated 2 years ago
BlinkDL / LinearAttentionArena
Here we will test various linear attention designs.
☆62 Updated last year
BlinkDL / WorldModel
Let us make Psychohistory (as in Asimov) a reality, and accessible to everyone. Useful for LLM grounding and games / fiction / business /…
☆40 Updated 2 years ago
RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆98 Updated 10 months ago
berlino / gated_linear_attention
☆106 Updated last year
horseee / LLaMA-Pruning
Structural Pruning for LLaMA
☆54 Updated 2 years ago
BBuf / flash-rwkv
☆32 Updated last year
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆63 Updated 2 years ago
RobertCsordas / moe
Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"
☆38 Updated last month
BBuf / RWKV-World-HF-Tokenizer
☆34 Updated last year
NolanoOrg / sparse_quant_llms
SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia
☆41 Updated 2 years ago
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78 Updated last year
tridao / flash-attention-wheels
☆52 Updated last year
kyegomez / LM-Infinite
Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆40 Updated 8 months ago
IST-DASLab / QIGen
Repository for CPU Kernel Generation for LLM Inference
☆26 Updated 2 years ago
eth-easl / fmengine
Utilities for Training Very Large Models
☆58 Updated 10 months ago
codekansas / rwkv
RWKV model implementation
☆38 Updated 2 years ago
yikangshen / megablocks
☆20 Updated last year
xhan77 / in-context-alignment
In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning
☆35 Updated last year
LegallyCoder / mamba-hf
Implementation of the Mamba SSM with hf_integration.
☆56 Updated 11 months ago
recursal / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆46 Updated last year
Zyphra / Zyda_processing
☆37 Updated last year
juvi21 / CoPE-cuda
Contextual Position Encoding but with some custom CUDA Kernels https://arxiv.org/abs/2405.18719
☆22 Updated last year
graphcore-research / jax-scalify
JAX Scalify: end-to-end scaled arithmetics
☆16 Updated 9 months ago
LAION-AI / Conditional-Pretraining-of-Large-Language-Models
☆37 Updated 2 years ago
johanwind / wind_rwkv
☆24 Updated last week
AlignInc / aligner-replication
Reproduction of the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction"
☆22 Updated last year
softmax1 / Flash-Attention-Softmax-N
CUDA and Triton implementations of Flash Attention with SoftmaxN.
☆71 Updated last year
lucidrains / transformer-lm-gan
Explorations into adversarial losses on top of autoregressive loss for language modeling
☆37 Updated 5 months ago
kyegomez / Infini-attention
Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…
☆56 Updated last week