nelson-liu / website
☆13 Updated 3 years ago
Alternatives and similar repositories for website
Users who are interested in website are comparing it to the libraries listed below.
- Randomized Positional Encodings Boost Length Generalization of Transformers ☆83 Updated last year
- Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible… ☆87 Updated 2 weeks ago
- Repository of the paper "Accelerating Transformer Inference for Translation via Parallel Decoding" ☆121 Updated last year
- Implementation of RQ Transformer, proposed in the paper "Autoregressive Image Generation using Residual Quantization" ☆122 Updated 3 years ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆35 Updated 2 years ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆88 Updated last year
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022 ☆116 Updated 3 years ago
- Speech2Vec Reality Check ☆84 Updated 2 years ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆119 Updated last year
- Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch ☆76 Updated 2 years ago
- [NeurIPS 2022] "Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Spee… ☆17 Updated 2 years ago
- Python code for handling the Clotho dataset. ☆85 Updated 5 years ago
- Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023 ☆138 Updated last year
- ☆73 Updated 4 years ago
- LL3M: Large Language and Multi-Modal Model in Jax ☆74 Updated last year
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆68 Updated last year
- A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021 ☆48 Updated 3 years ago
- ☆50 Updated last year
- Staged Training for Transformer Language Models ☆33 Updated 3 years ago
- Beyond Straight-Through ☆105 Updated 2 years ago
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆63 Updated 3 years ago
- A Pytorch Implementations for Various Vector Quantization Methods ☆33 Updated 4 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 Updated 2 years ago
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models ☆30 Updated 3 years ago
- code for paper "Accessing higher dimensions for unsupervised word translation" ☆22 Updated 2 years ago
- ResiDual: Transformer with Dual Residual Connections, https://arxiv.org/abs/2304.14802 ☆96 Updated 2 years ago
- My explorations into editing the knowledge and memories of an attention network ☆35 Updated 2 years ago
- Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI ☆97 Updated 3 years ago
- A fusion of a linear layer and a cross entropy loss, written for pytorch in triton. ☆71 Updated last year
- Easily run PyTorch on multiple GPUs & machines ☆53 Updated last week