ayaka14732 / bart-base-jax
JAX implementation of the bart-base model
☆31 · Updated 2 years ago
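For orientation, the same bart-base checkpoint can also be loaded through Hugging Face's Flax port of BART; the minimal sketch below uses that API as a stand-in rather than bart-base-jax's own interface, and assumes `transformers` is installed with Flax/JAX support and the public `facebook/bart-base` weights are reachable.

```python
# Minimal sketch using Hugging Face's FlaxBartModel as a stand-in; bart-base-jax
# itself exposes its own pure-JAX interface, which is not shown here.
from transformers import BartTokenizer, FlaxBartModel

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = FlaxBartModel.from_pretrained("facebook/bart-base")

# Tokenize a sentence and run the encoder-decoder forward pass.
inputs = tokenizer("JAX implementation of the bart-base model", return_tensors="jax")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size=768)
```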
Alternatives and similar repositories for bart-base-jax
Users interested in bart-base-jax are comparing it to the libraries listed below.
- Exploring finetuning public checkpoints on filtered 8K sequences from the Pile (☆116, updated 2 years ago)
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… (☆34, updated last year)
- Implementation of the Mamba SSM with Hugging Face integration (☆56, updated 10 months ago)
- ☆35, updated last year
- A truly flash T5 implementation! (☆68, updated last year)
- My explorations into editing the knowledge and memories of an attention network (☆35, updated 2 years ago)
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format (☆27, updated 2 years ago)
- A new metric for evaluating the faithfulness of text generated by LLMs. The work behind this repository can be found he… (☆31, updated last year)
- TrAVis: Visualise BERT attention in your browser (☆59, updated 2 years ago)
- ☆65, updated 10 months ago
- RWKV model implementation (☆38, updated 2 years ago)
- An English-to-Cantonese machine translation model (☆52, updated 3 months ago)
- Demonstration that finetuning a RoPE model on longer sequences than it was pre-trained on extends the model's context limit (☆63, updated 2 years ago)
- Official implementation of "GPT or BERT: why not both?" (☆55, updated last month)
- Utilities for Training Very Large Models (☆58, updated 9 months ago)
- ☆73, updated last month
- ☆51, updated last year
- Evaluating LLMs with Dynamic Data (☆93, updated last month)
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given… (☆14, updated last year)
- Implementation of the paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google, in PyTorch (☆55, updated 2 weeks ago)
- Experiments with generating open-source language model assistants (☆97, updated 2 years ago)
- 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX (☆82, updated 3 years ago)
- HomebrewNLP in JAX flavour for maintainable TPU training (☆50, updated last year)
- Code for the ICML 2025 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)" (☆40, updated 2 weeks ago)
- Some common Huggingface transformers in maximal update parametrization (µP) (☆81, updated 3 years ago)
- A reimplementation of KOSMOS-1 from "Language Is Not All You Need: Aligning Perception with Language Models" (☆27, updated 2 years ago)
- An unofficial implementation of the Infini-gram model proposed by Liu et al. (2024) (☆33, updated last year)
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning (☆35, updated last year)
- QLoRA with Enhanced Multi GPU Support (☆37, updated last year)
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto (☆56, updated last year)