andrewargatkiny / dense-attention
This is the repo for DenseAttention and DANet, a fast and conceptually simple modification of standard attention and the Transformer
☆10 Updated this week
Alternatives and similar repositories for dense-attention:
Users who are interested in dense-attention are comparing it to the libraries listed below
- Compression schema for gradients of activations in backward pass ☆44 Updated last year
- ☆18 Updated last month
- MMLU eval for RU/EN ☆15 Updated last year
- ☆71 Updated 8 months ago
- RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts). ☆34 Updated 2 years ago
- RUSSE 2022: Russian Text Detoxification Based on Parallel Corpora ☆20 Updated last month
- Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers ☆92 Updated 9 months ago
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆50 Updated last year
- ☆20 Updated last year
- ☆20 Updated 9 months ago
- Implementation of OpenAI paper with Simple Noise Scale on Fastai V2 ☆19 Updated 4 years ago
- ☆22 Updated last year
- Evalica, your favourite evaluation toolkit ☆36 Updated this week
- Noise-Contrastive Visualization ☆55 Updated last year
- FusionBrain Challenge 2.0: creating multimodal multitask model ☆16 Updated 2 years ago
- Russian Artificial Text Detection ☆17 Updated 2 years ago
- Various transformers for FSDP research ☆37 Updated 2 years ago
- (re)Implementation of Learning Multi-level Dependencies for Robust Word Recognition ☆17 Updated 9 months ago
- Creating multimodal multitask models ☆50 Updated 2 years ago
- RuTransform: python framework for adversarial attacks and text data augmentation for Russian ☆19 Updated last year
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆30 Updated this week
- Named Entity Oriented Sentiment Analysis Task for mass-media texts ☆12 Updated 11 months ago
- Graph-Based Clustering using connected components and spanning trees. ☆25 Updated 3 years ago
- Framework for processing and filtering datasets ☆27 Updated 9 months ago
- ☆28 Updated 5 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating fundament… ☆62 Updated 6 months ago
- ☆12 Updated last year
- ☆26 Updated 3 weeks ago
- Effective LLM Alignment Toolkit ☆128 Updated 3 weeks ago