VectorInstitute / flex_model

☆13

Alternatives and similar repositories for flex_model

Users who are interested in flex_model are comparing it to the libraries listed below.

hadasah / btm
☆74 Updated last year
nathanhu0 / CaMeLS
Codebase for Context-aware Meta-learned Loss Scaling (CaMeLS). https://arxiv.org/abs/2305.15076.
☆25 Updated last year
abhishekpanigrahi1996 / transformer_in_transformer
☆45 Updated last year
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆97 Updated last year
amazon-science / llm-interpret
Code for the ACL 2023 paper: "Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Sc…
☆30 Updated last year
GSYfate / knnlm-limits
Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs"
☆23 Updated last month
sjelassi / transformers_ssm_copy
☆32 Updated last year
HazyResearch / skill-it
Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models
☆46 Updated last year
kaistAI / GAP
[ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization
☆29 Updated 9 months ago
msakarvadia / AttentionLens
Interpreting the latent space representations of attention head outputs for LLMs
☆33 Updated 10 months ago
Nix07 / finetuning
This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity…
☆27 Updated last year
allenai / bff
☆38 Updated last year
anadim / the-little-retrieval-test
☆34 Updated 2 years ago
berlino / seq_icl
☆53 Updated last year
TristanThrush / perplexity-correlations
Simple and scalable tools for data-driven pretraining data selection.
☆24 Updated 2 weeks ago
srush / mamba-primer
☆37 Updated last year
epfml / schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
☆74 Updated 7 months ago
guy-dar / embedding-space
☆54 Updated 2 years ago
hamishivi / EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆75 Updated 10 months ago
krandiash / quinine
A library to create and manage configuration files, especially for machine learning projects.
☆78 Updated 3 years ago
google-deepmind / streamingqa
☆48 Updated last year
kaistAI / factual-knowledge-acquisition
☆19 Updated last month
KoyenaPal / future-lens
Code and Data Repo for the CoNLL Paper -- Future Lens: Anticipating Subsequent Tokens from a Single Hidden State
☆18 Updated last year
lucidrains / pause-transformer
Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…
☆54 Updated last year
gregorbachmann / Next-Token-Failures
☆86 Updated last year
JeanKaddour / NoTrainNoGain
Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)
☆80 Updated last year
mmatena / model_merging
☆69 Updated 3 years ago
tanyuqian / redco
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
☆66 Updated 6 months ago
HazyResearch / aioli
Aioli: A unified optimization framework for language model data mixing
☆27 Updated 5 months ago
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆67 Updated 2 months ago