leloykun / mmsg
Generate interleaved text and image content in a structured format you can directly pass to downstream APIs.
☆28 Updated 8 months ago
Alternatives and similar repositories for mmsg
Users who are interested in mmsg are comparing it to the libraries listed below.
- Minimum Description Length probing for neural network representations ☆18 Updated 4 months ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 9 months ago
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆22 Updated 8 months ago
- Implementation of Spectral State Space Models ☆16 Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆33 Updated last year
- ☆25 Updated last year
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆28 Updated 9 months ago
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family. ☆29 Updated 2 months ago
- Aioli: A unified optimization framework for language model data mixing ☆27 Updated 5 months ago
- Latent Large Language Models ☆18 Updated 10 months ago
- Fork of Flame repo for training of some new stuff in development ☆14 Updated last week
- Training hybrid models for dummies. ☆23 Updated 5 months ago
- ☆51 Updated 7 months ago
- ☆63 Updated 9 months ago
- ☆34 Updated 9 months ago
- ☆79 Updated 10 months ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆17 Updated 3 months ago
- A repository for research on medium sized language models. ☆76 Updated last year
- Official implementation of "BERTs are Generative In-Context Learners" ☆28 Updated 3 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆38 Updated 2 weeks ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆54 Updated last year
- ☆35 Updated last year
- An unofficial implementation of the Infini-gram model proposed by Liu et al. (2024) ☆33 Updated last year
- Code for the paper "Function-Space Learning Rates" ☆20 Updated 3 weeks ago
- Efficient encoder-decoder architecture for small language models (≤1B parameters) with cross-architecture knowledge distillation and visi… ☆27 Updated 4 months ago
- Experiments for efforts to train a new and improved t5 ☆77 Updated last year
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions. ☆20 Updated last year
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆30 Updated 9 months ago