EleutherAI / pilev2

☆13

Alternatives and similar repositories for pilev2

Users who are interested in pilev2 are comparing it to the repositories listed below.

lucidrains / memory-editable-transformer
My explorations into editing the knowledge and memories of an attention network
☆35 Updated 2 years ago
renll / SeqBoat
[NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling
☆37 Updated last year
EleutherAI / semantic-memorization
☆44 Updated 7 months ago
lucidrains / einops-exts
Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️
☆54 Updated 2 years ago
applicaai / CCpdf
Index of URLs to pdf files all over the internet and scripts
☆24 Updated 2 years ago
lucidrains / holodeck-pytorch
Implementation of a holodeck, written in Pytorch
☆18 Updated last year
huggingface / m4-logs
M4 experiment logbook
☆58 Updated last year
lucidrains / quartic-transformer
Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens)
☆51 Updated 3 months ago
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆32 Updated last year
eth-easl / fmengine
Utilities for Training Very Large Models
☆58 Updated 9 months ago
lucidrains / token-shift-gpt
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
☆50 Updated 3 years ago
prateeky2806 / ComPEFT
☆25 Updated last year
peterbhase / SLAG-Belief-Updating
Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"
☆28 Updated 3 years ago
google-research-datasets / QAmeleon
QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P…
☆34 Updated last year
jxiw / BiGS
Official Repository of Pretraining Without Attention (BiGS), BiGS is the first model to achieve BERT-level transfer learning on the GLUE …
☆113 Updated last year
yikangshen / megablocks
☆20 Updated last year
allenai / bff
☆38 Updated last year
guy-dar / embedding-space
☆54 Updated 2 years ago
ClashLuke / tpucare
Automatically take good care of your preemptible TPUs
☆36 Updated 2 years ago
srush / tangent
Source-to-Source Debuggable Derivatives in Pure Python
☆15 Updated last year
crowsonkb / dice-mc
DiCE: The Infinitely Differentiable Monte-Carlo Estimator
☆31 Updated last year
kyleliang919 / Long-context-transformers
Exploring finetuning public checkpoints on filter 8K sequences on Pile
☆115 Updated 2 years ago
ethansmith2000 / TransformerExperiments
☆19 Updated last month
sunyt32 / torchscale
Transformers at any scale
☆41 Updated last year
cloneofsimo / zeroshampoo
☆34 Updated 9 months ago
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆63 Updated 2 years ago
lucidrains / autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
☆44 Updated 2 years ago
dkopi / Bitune
Implementation of Bitune: Bidirectional Instruction-Tuning
☆19 Updated last week
antofuller / configaformers
A python library for highly configurable transformers - easing model architecture search and experimentation.
☆49 Updated 3 years ago
codekansas / rwkv
RWKV model implementation
☆38 Updated last year